298 models in catalog

AI models

Every model we track: frontier flagships, open-weights specialists, and narrow benchmark-only releases. Filter by lab, country, access, modality, or release window. The catalog is sorted newest-first by default; head to the leaderboard for the ranked view.

Updated May 29, 2026 · Benchmarks via Artificial Analysis; specs and pricing via OpenRouter

Rank	Model (lab)	Multimodal index	AI2D	MMMU-Pro	ChartQA	DocVQA	MathVista	MMMU	Released	Country	Type	Access	Params	Cutoff	Context	Speed	Latency	Input $/M tokens	Output $/M tokens
#1	o4-mini OpenAI	82.9	—	—	—	—	84.3	81.6	2025	—	multimodal	API only	—	2024	200K	115	5.20	$1.10	$4.40
#2	o3 OpenAI	82	—	76.4	—	—	86.8	82.9	2025	—	llm	API only	—	2024	200K	50	20.00	$2.00	$8.00
#3	Pixtral Large Mistral AI	81.7	93.8	—	88.1	93.3	69.4	64	2024	—	multimodal	API only	—	2024	131K	0	0.50	$2.00	$6.00
#4	Nova Pro Amazon	81.5	—	—	89.2	93.5	—	61.7	2024	—	multimodal	API only	—	—	300K	100	0.50	$0.80	$3.20
#5	GPT-5 OpenAI	81.3	—	78.4	—	—	—	84.2	2025	—	llm	API only	—	2024	400K	100	2.00	$1.25	$10.00
#6	Llama 4 Scout Meta	80.8	—	—	88.8	94.4	70.7	69.4	2025	—	multimodal	Open weights	109B total / 17B active (MoE)	2024	10M	776	0.31	$0.08	$0.30
#7	Gemini 2.5 Flash Google	79.7	—	—	—	—	—	79.7	2025	—	multimodal	API only	—	2025	1M	85	0.70	$0.30	$2.50
Index breakdown for Gemini 2.5 Flash: Index 41.9 = (44.3 + 47.8 + 43.8 + 31.6) / 4, the equal-weighted mean of 4 components.
General (25%): 44.3. SimpleQA 26.9, AA-LCR 61.7, LongBench-v2 —, IFBench —
Reasoning (25%): 47.8. GPQA Diamond 82.8, Humanity’s Last Exam 12.7, FrontierMath —, ARC-AGI-2 —
Coding (25%): 43.8. SWE-bench Verified 60.4, Terminal-Bench 13.6, Aider Polyglot 61.9, SciCode 39.4
Tool use & agents (25%): 31.6. TAU-bench Retail —, τ²-bench 31.6, BFCL —, BrowseComp —
#8	Gemini 2.5 Pro Google	79.6	—	—	—	—	—	79.6	2025	—	multimodal	API only	—	2025	1M	85	0.70	$1.25	$10.00
#9	Nova Lite Amazon	78.5	—	—	86.8	92.4	—	56.2	2024	—	multimodal	API only	—	—	300K	100	0.50	$0.06	$0.24
#10	Llama 4 Maverick Meta	78.2	—	59.6	90	94.4	73.7	73.4	2025	—	multimodal	Open weights	400B total / 17B active (MoE)	2024	1M	639	0.20	$0.15	$0.60
#11	Grok-3 xAI	78	—	—	—	—	—	78	2025	—	multimodal	API only	—	2024	128K	100	0.70	$3.00	$15.00
#12	GPT-4o OpenAI	77.7	94.2	59.9	85.7	92.8	61.4	72.2	2024	—	multimodal	API only	—	2023	128K	132	0.50	$2.50	$10.00
#13	Claude Opus 4.6 Anthropic	77.3	—	77.3	—	—	—	—	2026	—	llm	API only	—	—	1M	48	1.65	$5.00	$25.00
#14	Claude 3.7 Sonnet Anthropic	75	—	—	—	—	—	75	2025	—	llm	API only	—	—	200K	101	0.40	$3.00	$15.00
#15	o1 OpenAI	74.7	—	—	—	—	71.8	77.6	2024	—	llm	API only	—	2023	200K	66	0.54	$15.00	$60.00
#16	Claude Sonnet 4 Anthropic	74.4	—	—	—	—	—	74.4	2025	—	llm	API only	—	2025	1M	101	0.40	$3.00	$15.00
#17	GPT-4.5 OpenAI	73.8	—	—	—	—	72.3	75.2	2025	—	multimodal	API only	—	—	128K	50	20.00	$75.00	$150.00
#18	GPT-4.1 OpenAI	73.5	—	—	—	—	72.2	74.8	2025	—	multimodal	API only	—	2024	1M	100	10.00	$2.00	$8.00
#19	GPT-4.1 Mini OpenAI	72.9	—	—	—	—	73.1	72.7	2025	—	multimodal	API only	—	2024	1M	150	5.00	$0.40	$1.60
#20	Gemini 2.5 Flash Lite Google	72.9	—	—	—	—	—	72.9	2025	—	multimodal	API only	—	2025	1M	6	0.44	$0.10	$0.40
#21	Gemini 2.0 Flash Google	70.7	—	—	—	—	—	70.7	2024	—	multimodal	API only	—	2024	1M	183	0.40	$0.10	$0.40
#22	Llama 3.2 11B Instruct Meta	66.4	91.1	33	83.4	88.4	51.5	50.7	2024	—	multimodal	Open weights	10.6B	2023	128K	168	0.20	$0.05	$0.05
#23	GPT-4.1 Nano OpenAI	55.8	—	—	—	—	56.2	55.4	2025	—	multimodal	API only	—	2024	1M	200	2.00	$0.10	$0.40

Ranked by the Multimodal index. Cell colors show relative standing within each column (red → yellow → green). Scores are curated approximations; see each model for sources.
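
The index breakdown shown for Gemini 2.5 Flash describes the index as an equal-weighted mean of four components. The Python sketch below reproduces that arithmetic from the listed scores. It assumes each component score is the plain mean of whichever benchmarks have a value, with missing entries skipped. That rule is inferred from the numbers shown, not taken from the site's methodology, and the function names are hypothetical.

```python
from statistics import mean

# Scores copied from the Gemini 2.5 Flash breakdown; None marks a missing score.
components = {
    "General": {"SimpleQA": 26.9, "AA-LCR": 61.7, "LongBench-v2": None, "IFBench": None},
    "Reasoning": {"GPQA Diamond": 82.8, "Humanity's Last Exam": 12.7,
                  "FrontierMath": None, "ARC-AGI-2": None},
    "Coding": {"SWE-bench Verified": 60.4, "Terminal-Bench": 13.6,
               "Aider Polyglot": 61.9, "SciCode": 39.4},
    "Tool use & agents": {"TAU-bench Retail": None, "τ²-bench": 31.6,
                          "BFCL": None, "BrowseComp": None},
}


def component_score(scores):
    """Mean of the benchmarks that have a score (assumed rule: skip missing ones)."""
    available = [s for s in scores.values() if s is not None]
    return mean(available) if available else None


def overall_index(components):
    """Equal-weighted mean of the component scores that could be computed."""
    parts = [component_score(scores) for scores in components.values()]
    return mean(p for p in parts if p is not None)


index = overall_index(components)
print(round(index, 1))  # 41.9
```

Under this assumption the four component means come out to 44.3, 47.75, 43.825, and 31.6, which match the displayed 44.3, 47.8, 43.8, and 31.6 after rounding to one decimal place.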