298 models in catalog

AI models

Every model we track: frontier flagships, open-weights specialists, and narrow benchmark-only releases. Filter by lab, country, access, modality, or release window. The catalog is sorted newest first by default; head to the leaderboard for the ranked view.


Updated May 29, 2026 · Benchmarks via Artificial Analysis, specs & pricing via OpenRouter


Rank	Model	Agents idx ↓	τ²-bench	BFCL	τ²-bench Airline	τ²-bench Retail	BrowseComp	TAU-bench Airline	TAU-bench Retail	Released	Country	Type	Access	Params	Cutoff	Context	Speed	Latency	In $/M tokens	Out $/M tokens
#76	GPT-5.1-Codex OpenAI	83	83	—	—	—	—	—	—	2025	—	multimodal	API only	—	—	400K	188	4.16	$1.25	$10.00
#77	MiniCPM5-1B OpenBMB	82.5	82.5	—	—	—	—	—	—	2026	—	llm	—	—	—	—	—	—	$0.00	$0.00
#78	GPT-5.1 OpenAI	81.9	81.9	—	—	—	—	—	—	2025	—	llm	API only	—	—	400K	115	0.77	$1.25	$10.00
#79	Qwen3.5 2B Alibaba	81.6	81.6	—	—	—	—	—	—	2026	—	llm	—	—	—	—	328	0.24	$0.00	$0.10
#80	Command A Cohere	80.7	80.7	—	—	—	—	—	—	2025	—	llm	Open weights	—	2024	256K	203	0.17	$2.50	$10.00
#81	Gemini 3 Flash Google	80.4	80.4	—	—	—	—	—	—	2025	—	multimodal	API only	—	—	1M	191	1.05	$0.50	$3.00
#82	Nova 2.0 Omni Amazon	80.4	80.4	—	—	—	—	—	—	2025	—	llm	—	—	—	—	—	—	$0.30	$2.50
#83	Claude Sonnet 4.6 Anthropic	79.5	79.5	—	—	—	—	—	—	2026	—	llm	API only	—	—	1M	75	1.13	$3.00	$15.00
#84	Qwen3 Coder Next Alibaba	79.5	79.5	—	—	—	—	—	—	2026	—	llm	Open weights	—	—	262K	92	1.14	$0.11	$0.80
#85	LongCat Flash Lite LongCat	79.5	79.5	—	—	—	—	—	—	2026	—	llm	—	—	—	—	110	5.59	$0.00	$0.00
#86	Claude Sonnet 4.5 Anthropic	78.1	78.1	—	—	—	—	70	86.2	2025	—	llm	API only	—	2025	1M	42	0.40	$3.00	$15.00
#87	EXAONE 4.5 33B LG AI Research	78.1	78.1	—	—	—	—	—	—	2026	—	llm	—	—	—	—	—	—	$0.00	$0.00
#88	GPT-5.4 nano OpenAI	76	76	—	—	—	—	—	—	2026	—	llm	API only	—	2025	400K	157	0.55	$0.20	$1.25
#89	Nova 2 Lite Amazon	75.7	75.7	—	—	—	—	—	—	2025	—	multimodal	API only	—	—	1M	229	0.89	$0.30	$2.50
#90	Grok Code Fast 1 xAI	75.7	75.7	—	—	—	—	—	—	2025	—	llm	—	—	—	—	—	—	$0.00	$0.00
#91	Grok 4 xAI	74.9	74.9	—	—	—	—	—	—	2025	—	llm	API only	—	2024	256K	100	0.70	$3.00	$15.00
#92	K-EXAONE LG AI Research	74.3	74.3	—	—	—	—	—	—	2025	—	llm	—	—	—	—	—	—	$0.00	$0.00
#93	Qwen3 Max Alibaba	74.3	74.3	—	—	—	—	—	—	2025	—	llm	API only	—	2025	262K	45	1.71	$0.78	$3.90
#94	Kimi K2 0905 Moonshot AI	73.4	73.4	—	—	—	—	—	—	2025	—	llm	API only	1T	—	262K	16	1.94	$0.60	$2.50
#95	Claude Opus 4 Anthropic	71.5	73.4	—	—	—	—	59.6	81.4	2025	—	llm	API only	—	2025	200K	120	0.40	$15.00	$75.00
#96	GPT-5 OpenAI	71.3	86.5	—	62.6	81.1	54.9	—	—	2025	—	llm	API only	—	2024	400K	100	2.00	$1.25	$10.00
#97	GPT-5 mini OpenAI	71.1	71.1	—	—	—	—	—	—	2025	—	llm	API only	—	2024	400K	200	1.00	$0.25	$2.00
#98	Mercury 2 Inception	70.8	70.8	—	—	—	—	—	—	2026	—	llm	API only	—	—	128K	790	6.11	$0.25	$0.75
#99	Claude Opus 4.1 Anthropic	69.9	71.4	—	—	—	—	56	82.4	2025	—	llm	API only	—	2025	200K	120	0.40	$15.00	$75.00
#100	Apriel-v1.6-15B-Thinker ServiceNow	69.3	69.3	—	—	—	—	—	—	2025	—	llm	—	—	—	—	—	—	$0.00	$0.00

Ranked on Agents. Cell colors show relative standing within each column (red → yellow → green). Scores are curated approximations — see each model for sources.
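
The page does not say how the Agents idx column is derived. In the rows above that report more than one benchmark, the listed value matches an unweighted mean of the available scores, with "—" cells skipped. The sketch below checks that reading against four rows copied from the table. The function name `agents_index` and the dictionary keys are hypothetical, and the averaging rule is an inference from the data, not a documented methodology.

```python
from statistics import mean
from typing import Optional


def agents_index(scores: dict[str, Optional[float]]) -> Optional[float]:
    """Unweighted mean of the scores that are present (None stands for a '—' cell).

    Assumption: inferred from the table rows, not taken from the site's methodology.
    """
    available = [s for s in scores.values() if s is not None]
    return mean(available) if available else None


# Benchmark scores copied from the table; keys are illustrative names.
rows = {
    "GPT-5": {"tau2_bench": 86.5, "tau2_airline": 62.6,
              "tau2_retail": 81.1, "browsecomp": 54.9},
    "Claude Opus 4": {"tau2_bench": 73.4, "tau_airline": 59.6, "tau_retail": 81.4},
    "Claude Opus 4.1": {"tau2_bench": 71.4, "tau_airline": 56.0, "tau_retail": 82.4},
    "Claude Sonnet 4.5": {"tau2_bench": 78.1, "tau_airline": 70.0, "tau_retail": 86.2},
}

# Agents idx values as listed in the table.
listed = {
    "GPT-5": 71.3,
    "Claude Opus 4": 71.5,
    "Claude Opus 4.1": 69.9,
    "Claude Sonnet 4.5": 78.1,
}

for name, scores in rows.items():
    computed = agents_index(scores)
    # A listed value rounded to one decimal should sit within 0.05 of the mean.
    matches = abs(computed - listed[name]) < 0.05
    print(f"{name}: computed {computed:.2f}, listed {listed[name]}, match={matches}")
```

For models with a single reported benchmark, the same rule makes the index equal to that one score, which is what most rows in the table show.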