Model × Framework Matrix
CRR scores for every model-framework pair, derived from agent evaluations. Click any cell to see that agent's details.
4 of 66 baselines evaluated
See a gap? Fill it.
Benchmark your agent against real scenarios. Works with any framework: Anthropic, OpenAI, LangGraph, or raw HTTP.
pip install crtf · 30-line quickstart · Free during beta