298 models in catalog

AI models

Every model we track: frontier flagships, open-weights specialists, and narrow benchmark-only releases. Filter by lab, country, access, modality, or release window. Sorted newest-first by default; head to the leaderboard for the ranked view.


Updated May 30, 2026 · Benchmarks via Artificial Analysis, specs & pricing via OpenRouter


Rank	Model	Index	General	Reason	Coding	Agents	Math	Multi	Long ctx	GPQA Diamond	DROP	ARC-AGI-2	BIG-Bench Hard	SciCode	Terminal-Bench	LiveCodeBench	SWE-bench Verified	Aider Polyglot	HumanEval	Aider Polyglot Edit	MBPP	MultiPL-E	SWE-bench Pro	AIME 2025	MATH-500	AIME 2024	MATH	GSM8K	MGSM	HMMT 2025	FrontierMath	τ²-bench	TAU-bench Retail	TAU-bench Airline	BFCL	BrowseComp	τ²-bench Airline	τ²-bench Retail	MMMU	MathVista	ChartQA	DocVQA	MMMU-Pro	AI2D	Humanity’s Last Exam	MMLU-Pro	MMLU	IFEval	SimpleQA	Multi-IF	LiveBench	Arena Hard	AA-LCR	LongBench-v2	Released ↓	Country	Type	Access	Params	Cutoff	Context	Speed	Latency	In $/M	Out $/M
#276	o1 OpenAI	47.1	53.2	30.4	37.9	66.7	58.9	74.7	59.3	78	—	—	—	35.8	12.9	67.9	41	61.7	88.1	—	—	—	—	—	97	74.3	96.4	97.1	89.3	—	5.5	62.6	70.8	50	—	—	—	—	77.6	71.8	—	—	—	—	7.7	84.1	92	—	47	—	67	—	59.3	—	2024	—	llm	API only	—	2023	200K	66	0.54	$15.00	$60.00
#277	OLMo 2 7B Allen Institute for AI	4.8	0	17.2	1.9	0	0.7	—	0	28.8	—	—	—	3.7	0	4.1	—	—	—	—	—	—	—	0.7	—	—	—	—	—	—	—	0	—	—	—	—	—	—	—	—	—	—	—	—	5.5	28.2	—	—	—	—	—	—	0	—	2024	—	llm	—	—	—	—	—	—	$0.00	$0.00
#278	Nova Pro Amazon	24.7	19	25.2	13.5	41.2	42.8	81.5	19	46.9	85.4	—	86.9	20.8	6.1	23.3	—	—	89	—	—	—	—	7	78.6	—	76.6	94.8	—	—	—	14	—	—	68.4	—	—	—	61.7	—	89.2	93.5	—	—	3.4	69.1	85.9	92.1	—	—	—	—	19	—	2024	—	multimodal	API only	—	—	300K	100	0.50	$0.80	$3.20
#279	Nova Lite Amazon	22.6	17.7	23.3	7.4	42.1	41.8	78.5	17.7	42	80.2	—	82.4	13.9	0.8	16.7	—	—	85.4	—	—	—	—	7	76.5	—	73.3	94.5	—	—	—	17.5	—	—	66.6	—	—	—	56.2	—	86.8	92.4	—	—	4.6	59	80.5	89.7	—	—	—	—	17.7	—	2024	—	multimodal	API only	—	—	300K	100	0.50	$0.06	$0.24
#280	Nova Micro Amazon	18.2	9.7	22.4	5.5	35.1	38.2	—	9.7	40	79.3	—	79.5	9.4	1.5	14	—	—	81.1	—	—	—	—	6	70.3	—	69.3	92.3	—	—	—	14	—	—	56.2	—	—	—	—	—	—	—	—	—	4.7	53.1	77.6	87.2	—	—	—	—	9.7	—	2024	—	llm	API only	—	—	128K	100	0.50	$0.03	$0.14
#281	Pixtral Large Mistral AI	25.8	10.3	27.1	29.2	36.5	36.9	81.7	10.3	50.5	—	—	—	29.2	—	26.1	—	—	—	—	—	—	—	2.3	71.4	—	—	—	—	—	—	36.5	—	—	—	—	—	—	64	69.4	88.1	93.3	—	93.8	3.6	70.1	—	—	—	—	—	—	10.3	—	2024	—	multimodal	API only	—	2024	131K	0	0.50	$2.00	$6.00
#282	Claude 3.5 Haiku Anthropic	27.1	23.3	22.6	24.6	37.8	72.1	—	23.3	41.6	83.1	—	—	27.4	2.3	31.4	40.6	28	88.1	—	—	—	—	—	72.1	—	69.4	—	85.6	—	—	24.6	51	22.8	—	—	—	—	—	—	—	—	—	—	3.5	65	80.9	—	—	—	—	—	23.3	—	2024	—	llm	API only	—	2024	200K	104	0.30	$0.80	$4.00
#283	Llama 3.1 Nemotron 70B Instruct NVIDIA	17.4	7	25.6	13.9	23.1	42.2	—	7	46.5	—	—	—	23.3	4.5	16.9	—	—	—	—	—	—	—	11	73.3	—	—	91.4	—	—	—	23.1	—	—	—	—	—	—	—	—	—	—	—	—	4.6	69	80.2	—	—	—	—	—	7	—	2024	—	llm	Open weights	70B	2023	—	292	0.24	$1.20	$1.20
#284	Llama 3.2 11B Instruct Meta	12.8	11.7	19	6	14.6	26.7	66.4	11.7	32.8	—	—	—	11.2	0.8	11	—	—	—	—	—	—	—	1.7	51.6	—	51.9	—	68.9	—	—	14.6	—	—	—	—	—	—	50.7	51.5	83.4	88.4	33	91.1	5.2	46.4	73	—	—	—	—	—	11.7	—	2024	—	multimodal	Open weights	10.6B	2023	128K	168	0.20	$0.05	$0.05
#285	Llama 3.2 3B Instruct Meta	11.8	2	19	5.2	21.1	26.1	—	2	32.8	—	—	—	5.2	—	8.3	—	—	—	—	—	—	—	3.3	48.9	—	48	77.7	58.2	—	—	21.1	—	—	—	—	—	—	—	—	—	—	—	—	5.2	34.7	63.4	77.4	—	—	—	—	2	—	2024	—	llm	Open weights	—	2023	131K	172	0.24	$0.05	$0.34
#286	Llama 3.2 1B Instruct Meta	4.6	5	12.5	0.9	0	7	—	5	19.6	—	—	—	1.7	0	1.9	—	—	—	—	—	—	—	0	14	—	—	—	—	—	—	0	—	—	—	—	—	—	—	—	—	—	—	—	5.3	20	—	—	—	—	—	—	5	—	2024	—	llm	Open weights	—	2023	131K	91	0.60	$0.03	$0.20
#287	Molmo 7B-D Allen Institute for AI	4.1	0	14.6	1.8	0	0	—	0	24	—	—	—	3.6	0	3.9	—	—	—	—	—	—	—	0	—	—	—	—	—	—	—	0	—	—	—	—	—	—	—	—	—	—	—	—	5.1	37.1	—	—	—	—	—	—	0	—	2024	—	llm	—	—	—	—	—	—	$0.00	$0.00
Index breakdown for Molmo 7B-D: Index 4.1 = (0.0 + 14.6 + 1.8 + 0.0) / 4, the equal-weighted mean of 4 components. General (25%): 0 (SimpleQA —, AA-LCR 0, LongBench-v2 —, IFBench —). Reasoning (25%): 14.6 (GPQA Diamond 24, Humanity’s Last Exam 5.1, FrontierMath —, ARC-AGI-2 —). Coding (25%): 1.8 (SWE-bench Verified —, Terminal-Bench 0, Aider Polyglot —, SciCode 3.6). Tool use & agents (25%): 0 (TAU-bench Retail —, τ²-bench 0, BFCL —, BrowseComp —).
#288	Qwen2.5 72B Instruct Alibaba	24.3	20.3	26.6	15.6	34.5	49.9	—	20.3	49	—	—	—	26.7	4.5	55.5	—	—	86.6	—	88.2	75.1	—	14	85.8	—	83.1	95.8	—	—	—	34.5	—	—	—	—	—	—	—	—	—	—	—	—	4.2	71.1	—	84.1	—	—	52.3	81.2	20.3	—	2024	—	llm	Open weights	—	2024	131K	100	0.37	$0.36	$0.40
#289	Mistral Large 2 Mistral AI	20.6	5.3	26.3	17.7	33	43.8	—	5.3	48.6	—	—	—	29.2	6.1	29.3	—	—	92	—	—	—	—	14	73.6	—	—	93	—	—	—	33	—	—	—	—	—	—	—	—	—	—	—	—	4	69.7	84	—	—	—	—	—	5.3	—	2024	—	llm	Open weights	123B	—	128K	42	0.40	$2.00	$6.00
#290	Llama 3.1 405B Instruct Meta	31	24.3	27.5	18.3	53.8	36.7	—	24.3	50.7	84.8	—	—	29.9	6.8	30.5	—	—	89	—	—	—	—	3	70.3	—	73.8	96.8	—	—	—	19	—	—	88.5	—	—	—	—	—	—	—	—	—	4.2	73.3	87.3	88.6	—	—	—	—	24.3	—	2024	—	llm	Open weights	405B	—	128K	100	0.40	$0.89	$0.89
#291	Llama 3.1 70B Instruct Meta	23.6	6.3	23.2	14.9	50	34.5	—	6.3	41.7	79.6	—	—	26.7	3	23.2	—	—	80.5	—	—	—	—	4	64.9	—	—	—	—	—	—	15.2	—	—	84.8	—	—	—	—	—	—	—	—	—	4.6	66.4	83.6	87.5	—	—	—	—	6.3	—	2024	—	llm	Open weights	—	2023	131K	1204	0.20	$0.40	$0.40
#292	Llama 3.1 8B Instruct Meta	21.7	15.7	17.8	7	46.3	28.1	—	15.7	30.4	59.5	—	—	13.2	0.8	11.6	—	—	72.6	—	—	—	—	4.3	51.9	—	—	—	—	—	—	16.4	—	—	76.1	—	—	—	—	—	—	—	—	—	5.1	48.3	69.4	80.4	—	—	—	—	15.7	—	2024	—	llm	Open weights	—	2023	131K	2047	0.20	$0.02	$0.05
#293	GPT-4o OpenAI	38.8	45.6	37.7	27.2	44.6	42.7	77.7	53	70.1	—	—	—	36.6	8.3	42.5	33.2	30.7	90.2	18.2	—	—	—	25.7	89.3	13.1	—	—	—	—	—	28.9	60.3	42.8	—	—	45.5	63.4	72.2	61.4	85.7	92.8	59.9	94.2	5.3	74.7	88.7	81	38.2	60.9	—	—	53	—	2024	—	multimodal	API only	—	2023	128K	132	0.50	$2.50	$10.00
#294	Phi-3 Mini Instruct 3.8B Microsoft	6.2	2	18.2	4.5	0	23	—	2	31.9	—	—	—	9	0	11.6	—	—	—	—	—	—	—	0.3	45.7	—	—	—	—	—	—	0	—	—	—	—	—	—	—	—	—	—	—	—	4.4	43.5	—	—	—	—	—	—	2	—	2024	—	llm	—	—	—	—	—	—	$0.00	$0.00
#295	Llama 3 70B Instruct Meta	7.8	0	21.2	9.9	0	48.3	—	0	37.9	—	—	—	18.9	0.8	19.8	—	—	—	—	—	—	—	—	48.3	—	—	—	—	—	—	0	—	—	—	—	—	—	—	—	—	—	—	—	4.4	57.4	—	—	—	—	—	—	0	—	2024	—	llm	Open weights	—	2023	8K	45	0.70	$0.51	$0.74
#296	Llama 3 8B Instruct Meta	5.9	0	17.4	6	0	49.9	—	0	29.6	—	—	—	11.9	0	9.6	—	—	—	—	—	—	—	—	49.9	—	—	—	—	—	—	0	—	—	—	—	—	—	—	—	—	—	—	—	5.1	40.5	—	—	—	—	—	—	0	—	2024	—	llm	Open weights	—	2023	8K	81	0.51	$0.04	$0.04
#297	Claude 3 Haiku Anthropic	17.6	21	18.6	9.7	21.1	39.4	—	21	33.3	78.4	—	73.7	18.6	0.8	15.4	—	—	75.9	—	—	—	—	—	39.4	—	38.9	88.9	75.1	—	—	21.1	—	—	—	—	—	—	—	—	—	—	—	—	3.9	—	75.2	—	—	—	—	—	21	—	2024	—	multimodal	API only	—	2023	200K	104	0.40	$0.25	$1.25
#298	Mistral 7B Instruct Mistral AI	3.4	0	11	2.4	0	12.1	—	0	17.7	—	—	—	2.4	—	4.6	—	—	—	—	—	—	—	—	12.1	—	—	—	—	—	—	0	—	—	—	—	—	—	—	—	—	—	—	—	4.3	24.5	—	—	—	—	—	—	0	—	2023	—	llm	—	—	—	—	90	0.39	$0.20	$0.20

The four score columns after Index (General, Reason, Coding, Agents) are the v1.2 components, weighted 25% each, that feed it. The reference per-category averages that follow (Math, Multi, Long ctx) are not part of the index. Every individual benchmark in our catalog is also shown, grouped by category and ordered by coverage. Cell colors show relative standing within each column (red → yellow → green). Scores are curated approximations; see each model for sources.
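As a rough illustration of the index arithmetic described above, the sketch below recomputes the Molmo 7B-D row from its four component scores. The function names are hypothetical. Averaging only the benchmarks that have a value is an assumption: it is consistent with the numbers shown for this row, but the catalog does not confirm it as the method.

```python
from statistics import mean
from typing import Optional


def component_score(benchmarks: dict[str, Optional[float]]) -> Optional[float]:
    """Average the benchmarks that have a value (None marks a missing cell).

    Assumption: missing benchmarks are skipped rather than counted as 0.
    """
    available = [v for v in benchmarks.values() if v is not None]
    return mean(available) if available else None


def index_score(components: dict[str, float]) -> float:
    """Equal-weighted mean of the components (25% each when there are four)."""
    return sum(components.values()) / len(components)


# Component scores for Molmo 7B-D as listed in the table.
molmo_components = {
    "General": 0.0,
    "Reasoning": 14.6,
    "Coding": 1.8,
    "Tool use & agents": 0.0,
}

# Reasoning benchmarks for the same row; FrontierMath and ARC-AGI-2 are missing.
molmo_reasoning = {
    "GPQA Diamond": 24.0,
    "Humanity's Last Exam": 5.1,
    "FrontierMath": None,
    "ARC-AGI-2": None,
}

print(round(index_score(molmo_components), 1))
print(component_score(molmo_reasoning))
```

Under these assumptions the index works out to 4.1, matching the table, and the Reasoning average of about 14.55 lines up with the 14.6 shown for that component.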