298 models in catalog

AI models

Every model we track — frontier flagships, open-weights specialists, narrow benchmarks-only releases. Filter by lab, country, access, modality, or release window. Sorted by newest by default; head to the leaderboard for the ranked view.

Updated May 30, 2026 · Benchmarks via Artificial Analysis, specs & pricing via OpenRouter

Rank	Model	Index	General	Reason	Coding	Agents	Math	Multi	Long ctx	GPQA Diamond	DROP	ARC-AGI-2	BIG-Bench Hard	SciCode	Terminal-Bench	LiveCodeBench	SWE-bench Verified	Aider Polyglot	HumanEval	Aider Polyglot Edit	MBPP	MultiPL-E	SWE-bench Pro	AIME 2025	MATH-500	AIME 2024	MATH	GSM8K	MGSM	HMMT 2025	FrontierMath	τ²-bench	TAU-bench Retail	TAU-bench Airline	BFCL	BrowseComp	τ²-bench Airline	τ²-bench Retail	MMMU	MathVista	ChartQA	DocVQA	MMMU-Pro	AI2D	Humanity’s Last Exam	MMLU-Pro	MMLU	IFEval	SimpleQA	Multi-IF	LiveBench	Arena Hard	AA-LCR	LongBench-v2	Released ↓	Country	Type	Access	Params	Cutoff	Context	Speed	Latency	In $/M	Out $/M
#51	GPT-5.4 nano OpenAI	60.2	66	54.1	44.7	76	—	—	66	81.7	—	—	—	46.9	42.4	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	76	—	—	—	—	—	—	—	—	—	—	—	—	26.5	—	—	—	—	—	—	—	66	—	2026	—	llm	API only	—	2025	400K	157	0.55	$0.20	$1.25
#52	Mistral Small 4 Mistral AI	39.2	44.7	43.2	27.7	41.2	—	—	44.7	76.9	—	—	—	38	17.4	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	41.2	—	—	—	—	—	—	—	—	—	—	—	—	9.5	—	—	—	—	—	—	—	44.7	—	2026	—	multimodal	Open weights	—	—	262K	145	0.51	$0.15	$0.60
#53	NVIDIA Nemotron 3 Nano 4B NVIDIA	21.1	16.7	28.1	11.6	28.1	—	—	16.7	51.3	—	—	—	16.4	6.8	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	28.1	—	—	—	—	—	—	—	—	—	—	—	—	4.8	—	—	—	—	—	—	—	16.7	—	2026	—	llm	—	—	—	—	—	—	$0.00	$0.00
#54	GLM 5 Turbo Zhipu AI	63.2	60.7	55.1	38.5	98.5	—	—	60.7	84.7	—	—	—	43.6	33.3	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	98.5	—	—	—	—	—	—	—	—	—	—	—	—	25.4	—	—	—	—	—	—	—	60.7	—	2026	—	llm	API only	—	—	203K	—	—	$1.20	$4.00
#55	NVIDIA Nemotron 3 Super 120B A12B NVIDIA	52.5	60	49.6	32.4	67.8	—	—	60	80	—	—	—	36	28.8	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	67.8	—	—	—	—	—	—	—	—	—	—	—	—	19.2	—	—	—	—	—	—	—	60	—	2026	—	llm	—	—	—	—	211	1.01	$0.30	$0.80
#56	Grok 4.20 0309 xAI	64.4	59	59.3	42.8	96.5	—	—	59	88.5	—	—	—	44.7	40.9	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	96.5	—	—	—	—	—	—	—	—	—	—	—	—	30	—	—	—	—	—	—	—	59	—	2026	—	llm	—	—	—	—	97	0.62	$2.00	$6.00
#57	Qwen3.5-9B Alibaba	54.7	59	46.9	26	86.8	—	—	59	80.6	—	—	—	27.7	24.2	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	86.8	—	—	—	—	—	—	—	—	—	—	—	—	13.3	—	—	—	—	—	—	—	59	—	2026	—	multimodal	Open weights	—	—	262K	51	0.33	$0.04	$0.15
#58	Sarvam 105B Sarvam	25.7	0	41.9	14	46.8	—	—	0	73.8	—	—	—	26.4	1.5	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	46.8	—	—	—	—	—	—	—	—	—	—	—	—	10.1	—	—	—	—	—	—	—	0	—	2026	—	llm	—	—	—	—	128	1.29	$0.00	$0.00
#59	Sarvam 30B Sarvam	20.1	0	35.2	10.8	34.5	—	—	0	63.3	—	—	—	19.2	2.3	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	34.5	—	—	—	—	—	—	—	—	—	—	—	—	7	—	—	—	—	—	—	—	0	—	2026	—	llm	—	—	—	—	214	1.17	$0.00	$0.00
#60	GPT-5.4 OpenAI	71.3	74	66.8	57.1	87.1	—	—	74	92	—	—	—	56.6	57.6	—	—	—	—	—	—	—	57.7	—	—	—	—	—	—	—	—	87.1	—	—	—	—	—	—	—	—	—	—	—	—	41.6	—	—	—	—	—	—	—	74	—	2026	—	llm	API only	—	—	1.1M	84	0.63	$2.50	$15.00
#61	Mercury 2 Inception	46.5	36.3	46.3	32.6	70.8	—	—	36.3	77	—	—	—	38.7	26.5	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	70.8	—	—	—	—	—	—	—	—	—	—	—	—	15.5	—	—	—	—	—	—	—	36.3	—	2026	—	llm	API only	—	—	128K	790	6.11	$0.25	$0.75
#62	Qwen3.5 4B Alibaba	52.1	55.7	42.4	18.3	92.1	—	—	55.7	77.1	—	—	—	18.3	18.2	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	92.1	—	—	—	—	—	—	—	—	—	—	—	—	7.8	—	—	—	—	—	—	—	55.7	—	2026	—	llm	—	—	—	—	164	0.24	$0.00	$0.20
#63	Qwen3.5 2B Alibaba	34	23.7	25.3	5.5	81.6	—	—	23.7	45.6	—	—	—	7.2	3.8	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	81.6	—	—	—	—	—	—	—	—	—	—	—	—	4.9	—	—	—	—	—	—	—	23.7	—	2026	—	llm	—	—	—	—	328	0.24	$0.00	$0.10
#64	Qwen3.5 0.8B Alibaba	21.9	6.7	14.3	1.5	65.2	—	—	6.7	23.6	—	—	—	2.9	0	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	65.2	—	—	—	—	—	—	—	—	—	—	—	—	4.9	—	—	—	—	—	—	—	6.7	—	2026	—	llm	—	—	—	—	120	0.23	$0.00	$0.10
#65	Qwen3.5-122B-A10B Alibaba	62.9	66.7	54.6	36.6	93.6	—	—	66.7	85.7	—	—	—	42	31.1	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	93.6	—	—	—	—	—	—	—	—	—	—	—	—	23.4	—	—	—	—	—	—	—	66.7	—	2026	—	multimodal	Open weights	—	—	262K	129	1.07	$0.26	$2.08
#66	Qwen3.5-27B Alibaba	62.8	67.3	54	36.1	93.9	—	—	67.3	85.8	—	—	—	39.5	32.6	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	93.9	—	—	—	—	—	—	—	—	—	—	—	—	22.2	—	—	—	—	—	—	—	67.3	—	2026	—	multimodal	Open weights	—	—	262K	91	1.48	$0.20	$1.56
#67	Qwen3.5-35B-A3B Alibaba	59	62.7	52.1	32.1	89.2	—	—	62.7	84.5	—	—	—	37.7	26.5	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	89.2	—	—	—	—	—	—	—	—	—	—	—	—	19.7	—	—	—	—	—	—	—	62.7	—	2026	—	multimodal	Open weights	—	—	262K	121	1.07	$0.14	$1.00
#68	LFM2-24B-A2B Liquid AI	10.6	0	25.9	5.5	11.1	—	—	0	47.4	—	—	—	10.9	0	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	11.1	—	—	—	—	—	—	—	—	—	—	—	—	4.4	—	—	—	—	—	—	—	0	—	2026	—	llm	Open weights	—	—	128K	208	0.30	$0.03	$0.12
#69	GPT-5.3-Codex OpenAI	69.7	74	65.7	53.1	86	—	—	74	91.5	—	—	—	53.2	53	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	86	—	—	—	—	—	—	—	—	—	—	—	—	39.9	—	—	—	—	—	—	—	74	—	2026	—	multimodal	API only	—	—	400K	73	81.08	$1.75	$14.00
#70	Gemini 3.1 Pro Google	76.2	72.7	71.9	64.4	95.6	—	—	72.7	94.3	—	77.1	—	58.9	53.8	—	80.6	—	—	—	—	—	—	—	—	—	—	—	—	—	—	95.6	—	—	—	—	—	—	—	—	—	—	—	—	44.4	—	—	—	—	—	—	—	72.7	—	2026	—	multimodal	API only	—	—	1M	142	26.02	$2.00	$12.00
#71	Claude Sonnet 4.6 Anthropic	67.2	70.7	58.6	59.8	79.5	—	—	70.7	87.5	—	58.3	—	46.9	53	—	79.6	—	—	—	—	—	—	—	—	—	—	—	—	—	—	79.5	—	—	—	—	—	—	—	—	—	—	—	—	30	—	—	—	—	—	—	—	70.7	—	2026	—	llm	API only	—	—	1M	75	1.13	$3.00	$15.00
#72	Tiny Aya Global Cohere	4.9	0	17.9	1.8	0	—	—	0	30.5	—	—	—	3.6	0	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	0	—	—	—	—	—	—	—	—	—	—	—	—	5.2	—	—	—	—	—	—	—	0	—	2026	—	llm	—	—	—	—	126	0.35	$0.00	$0.00
#73	Qwen3.5 397B A17B Alibaba	65.3	65.7	58.3	41.5	95.6	—	—	65.7	89.3	—	—	—	42	40.9	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	95.6	—	—	—	—	—	—	—	—	—	—	—	—	27.3	—	—	—	—	—	—	—	65.7	—	2026	—	multimodal	Open weights	—	—	262K	53	1.82	$0.39	$2.34
#74	MiniMax M2.5 MiniMax	63	66	52	38.7	95.3	—	—	66	84.8	—	—	—	42.6	34.8	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	95.3	—	—	—	—	—	—	—	—	—	—	—	—	19.1	—	—	—	—	—	—	—	66	—	2026	—	llm	Open weights	—	—	205K	87	1.16	$0.15	$1.15
Index breakdown for MiniMax M2.5: Index 63 = (66.0 + 52.0 + 38.7 + 95.3) / 4, the equal-weighted mean of 4 components.
General (25%): 66 · SimpleQA — · AA-LCR 66 · LongBench-v2 — · IFBench —
Reasoning (25%): 52 · GPQA Diamond 84.8 · Humanity’s Last Exam 19.1 · FrontierMath — · ARC-AGI-2 —
Coding (25%): 38.7 · SWE-bench Verified — · Terminal-Bench 34.8 · Aider Polyglot — · SciCode 42.6
Tool use & agents (25%): 95.3 · TAU-bench Retail — · τ²-bench 95.3 · BFCL — · BrowseComp —
#75	GLM-5 Zhipu AI	68.5	63.3	56.6	55.7	98.2	—	—	63.3	86	—	—	—	46.2	43.2	—	77.8	—	—	—	—	—	—	—	—	—	—	—	—	—	—	98.2	—	—	—	—	—	—	—	—	—	—	—	—	27.2	—	—	—	—	—	—	—	63.3	—	2026	—	llm	Open weights	744B (44B active)	—	203K	67	0.77	$0.60	$1.92

The four score columns after Index (General, Reason, Coding, Agents) are the v1.2 components, weighted 25% each, that feed it. The reference per-category averages that follow (Math, Multi, Long ctx) are not part of the index. Every individual benchmark in our catalog is also shown, grouped by category and ordered by coverage. Scores are curated approximations; see each model for sources.
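The index arithmetic can be sketched in a few lines of Python, using the MiniMax M2.5 row as the worked example. The function names here are hypothetical. The rule that a category score is the plain mean of whichever benchmarks are reported is an assumption: it is consistent with the MiniMax M2.5 numbers (Coding 38.7 from Terminal-Bench 34.8 and SciCode 42.6) but is not stated on this page.

```python
from statistics import mean
from typing import Optional


def category_score(benchmarks: dict[str, Optional[float]]) -> Optional[float]:
    """Mean of the reported benchmarks in a category.

    ASSUMPTION: missing benchmarks (None) are skipped rather than
    counted as zero; the page does not spell this rule out.
    """
    reported = [v for v in benchmarks.values() if v is not None]
    return mean(reported) if reported else None


def index_score(components: dict[str, float]) -> float:
    """Equal-weighted mean of the component scores (25% each for four)."""
    return sum(components.values()) / len(components)


# Coding benchmarks for MiniMax M2.5, copied from its breakdown.
coding = category_score({
    "SWE-bench Verified": None,
    "Terminal-Bench": 34.8,
    "Aider Polyglot": None,
    "SciCode": 42.6,
})

# The four component scores for MiniMax M2.5, copied from its table row.
minimax_m2_5 = {"General": 66.0, "Reasoning": 52.0, "Coding": 38.7, "Agents": 95.3}
index = index_score(minimax_m2_5)

print(f"Coding: {coding:.1f}, Index: {index:.1f}")
```

Because every component carries the same weight, a single strong category (here Agents at 95.3) can lift the index well above a model's weaker scores, which is worth keeping in mind when comparing rows.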