298 models in catalog

AI models

Every model we track: frontier flagships, open-weights specialists, and narrow benchmark-only releases. Filter by lab, country, access, modality, or release window. Models are sorted newest first by default; see the leaderboard for the ranked view.


Updated May 29, 2026 · Benchmarks via Artificial Analysis, specs & pricing via OpenRouter


Rank	Model	Index	General	Reasoning	Coding	Agents	Math	Multimodal	Long ctx	GPQA Diamond	DROP	ARC-AGI-2	BIG-Bench Hard	SciCode	Terminal-Bench	LiveCodeBench	SWE-bench Verified	Aider Polyglot	HumanEval	Aider Polyglot Edit	MBPP	MultiPL-E	SWE-bench Pro	AIME 2025	MATH-500	AIME 2024	MATH	GSM8K	MGSM	HMMT 2025	FrontierMath	τ²-bench	TAU-bench Retail	TAU-bench Airline	BFCL	BrowseComp	τ²-bench Airline	τ²-bench Retail	MMMU	MathVista	ChartQA	DocVQA	MMMU-Pro	AI2D	Humanity’s Last Exam	MMLU-Pro	MMLU	IFEval	SimpleQA	Multi-IF	LiveBench	Arena Hard	AA-LCR	LongBench-v2	Released	Country	Type	Access	Params	Cutoff	Context	Speed	Latency	In $/M	Out $/M
#26	Ling-2.6-flash InclusionAI	42	25	32.8	24.2	86	—	—	25	59.3	—	—	—	27.1	21.2	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	86	—	—	—	—	—	—	—	—	—	—	—	—	6.2	—	—	—	—	—	—	—	25	—	2026	—	llm	API only	—	—	262K	—	—	$0.01	$0.03
#27	Kimi K2.6 Moonshot AI	69.5	69.7	63.5	48.7	95.9	—	—	69.7	91.1	—	—	—	53.5	43.9	—	—	—	—	—	—	—	58.6	—	—	—	—	—	—	—	—	95.9	—	—	—	—	—	—	—	—	—	—	—	—	35.9	—	—	—	—	—	—	—	69.7	—	2026	—	llm	Open weights	1T (32B active)	—	262K	57	1.20	$0.73	$3.49
#28	Claude Opus 4.7 Anthropic	72.8	70.3	66.9	65.5	88.6	—	—	70.3	94.2	—	—	—	54.5	54.5	—	87.6	—	—	—	—	—	—	—	—	—	—	—	—	—	—	88.6	—	—	—	—	—	—	—	—	—	—	—	—	39.6	—	—	—	—	—	—	—	70.3	—	2026	—	llm	API only	—	—	1M	49	1.42	$5.00	$25.00
#29	JT-MINI China Mobile	41.1	11.7	37.1	22.7	93	—	—	11.7	67.6	—	—	—	27.2	18.2	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	93	—	—	—	—	—	—	—	—	—	—	—	—	6.6	—	—	—	—	—	—	—	11.7	—	2026	—	llm	—	—	—	—	—	—	$0.00	$0.00
#30	EXAONE 4.5 33B LG AI Research	49.3	49.3	45.5	24.3	78.1	—	—	49.3	79.4	—	—	—	28	20.5	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	78.1	—	—	—	—	—	—	—	—	—	—	—	—	11.6	—	—	—	—	—	—	—	49.3	—	2026	—	llm	—	—	—	—	—	—	$0.00	$0.00
#31	Muse Spark Meta	68.5	69.7	64.2	48.5	91.5	—	—	69.7	88.4	—	—	—	51.5	45.5	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	91.5	—	—	—	—	—	—	—	—	—	—	—	—	39.9	—	—	—	—	—	—	—	69.7	—	2026	—	multimodal	API only	—	—	—	—	—	$0.00	$0.00
#32	GLM 5.1 Zhipu AI	65.2	62.3	57.4	43.5	97.7	—	—	62.3	86.8	—	—	—	43.8	43.2	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	97.7	—	—	—	—	—	—	—	—	—	—	—	—	28	—	—	—	—	—	—	—	62.3	—	2026	—	llm	Open weights	—	—	203K	53	0.78	$0.98	$3.08
#33	Grok 4.20 0309 v2 xAI	63.6	58	61.7	41.8	93	—	—	58	91.1	—	—	—	45.6	37.9	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	93	—	—	—	—	—	—	—	—	—	—	—	—	32.2	—	—	—	—	—	—	—	58	—	2026	—	llm	—	—	—	—	105	0.70	$2.00	$6.00
#34	Gemma 4 26B A4B Google	45.2	55.7	48.8	32.5	43.6	—	—	55.7	79.2	—	—	—	40	25	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	43.6	—	—	—	—	—	—	—	—	—	—	—	—	18.3	—	—	—	—	—	—	—	55.7	—	2026	—	multimodal	Open weights	—	—	262K	66	0.71	$0.06	$0.33
#35	Gemma 4 E4B Google	26.1	30.7	31.2	16.4	26	—	—	30.7	57.6	—	—	—	24.4	8.3	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	26	—	—	—	—	—	—	—	—	—	—	—	—	4.7	—	—	—	—	—	—	—	30.7	—	2026	—	llm	—	—	—	—	—	—	$0.00	$0.00
#36	Qwen3.6 Plus Alibaba	66.7	69.7	57	42.3	97.7	—	—	69.7	88.2	—	—	—	40.7	43.9	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	97.7	—	—	—	—	—	—	—	—	—	—	—	—	25.7	—	—	—	—	—	—	—	69.7	—	2026	—	multimodal	API only	—	—	1M	52	1.73	$0.33	$1.95
#37	Step 3.5 Flash 2603 StepFun	57.5	54.3	52.6	35.6	87.4	—	—	54.3	82.6	—	—	—	38.5	32.6	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	87.4	—	—	—	—	—	—	—	—	—	—	—	—	22.6	—	—	—	—	—	—	—	54.3	—	2026	—	llm	—	—	—	—	197	0.90	$0.00	$0.00
#38	Gemma 4 31B Google	55.4	62	54.2	39.9	65.5	—	—	62	85.7	—	—	—	43.4	36.4	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	65.5	—	—	—	—	—	—	—	—	—	—	—	—	22.7	—	—	—	—	—	—	—	62	—	2026	—	multimodal	Open weights	—	—	262K	36	0.79	$0.12	$0.37
#39	Gemma 4 E2B Google	18.3	15	24	12	22.2	—	—	15	43.3	—	—	—	20.9	3	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	22.2	—	—	—	—	—	—	—	—	—	—	—	—	4.8	—	—	—	—	—	—	—	15	—	2026	—	llm	—	—	—	—	—	—	$0.00	$0.00
#40	GLM 5V Turbo Zhipu AI	61.5	61	48.4	38.1	98.5	—	—	61	80.9	—	—	—	43.5	32.6	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	98.5	—	—	—	—	—	—	—	—	—	—	—	—	15.8	—	—	—	—	—	—	—	61	—	2026	—	multimodal	API only	—	—	203K	—	—	$1.20	$4.00
#41	Trinity Large Thinking Arcee AI	49.4	33	45	29.4	90.1	—	—	33	75.2	—	—	—	36.1	22.7	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	90.1	—	—	—	—	—	—	—	—	—	—	—	—	14.7	—	—	—	—	—	—	—	33	—	2026	—	llm	Open weights	—	—	262K	129	0.61	$0.22	$0.85
#42	Qwen3.5 Omni Plus Alibaba	55.1	52.7	48.3	30.9	88.3	—	—	52.7	82.6	—	—	—	40.5	21.2	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	88.3	—	—	—	—	—	—	—	—	—	—	—	—	13.9	—	—	—	—	—	—	—	52.7	—	2026	—	llm	—	—	—	—	54	1.28	$0.40	$4.80
#43	Qwen3.5 Omni Flash Alibaba	46.5	44	40.7	16.9	84.5	—	—	44	74.2	—	—	—	25.5	8.3	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	84.5	—	—	—	—	—	—	—	—	—	—	—	—	7.1	—	—	—	—	—	—	—	44	—	2026	—	llm	—	—	—	—	235	0.99	$0.10	$0.80
Index breakdown for Qwen3.5 Omni Flash: Index 46.5 = (44.0 + 40.7 + 16.9 + 84.5) / 4, the equal-weighted mean of 4 components. General (25%): 44 [SimpleQA —, AA-LCR 44, LongBench-v2 —, IFBench —]. Reasoning (25%): 40.7 [GPQA Diamond 74.2, Humanity’s Last Exam 7.1, FrontierMath —, ARC-AGI-2 —]. Coding (25%): 16.9 [SWE-bench Verified —, Terminal-Bench 8.3, Aider Polyglot —, SciCode 25.5]. Tool use & agents (25%): 84.5 [TAU-bench Retail —, τ²-bench 84.5, BFCL —, BrowseComp —].
#44	KAT-Coder-Pro V2 Kuaishou	62.5	66	50.8	43.8	89.5	—	—	66	85.5	—	—	—	38.3	49.2	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	89.5	—	—	—	—	—	—	—	—	—	—	—	—	16	—	—	—	—	—	—	—	66	—	2026	—	llm	API only	—	—	256K	108	1.36	$0.30	$1.20
#45	MiMo-V2-Omni-0327 Xiaomi	60.6	63.7	53	37.6	88	—	—	63.7	85.5	—	—	—	39.5	35.6	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	88	—	—	—	—	—	—	—	—	—	—	—	—	20.4	—	—	—	—	—	—	—	63.7	—	2026	—	llm	—	—	—	—	110	1.51	$0.40	$2.00
#46	Nemotron Cascade 2 30B A3B NVIDIA	39.7	34	43.6	28	53.2	—	—	34	75.8	—	—	—	34.8	21.2	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	53.2	—	—	—	—	—	—	—	—	—	—	—	—	11.4	—	—	—	—	—	—	—	34	—	2026	—	llm	—	—	—	—	—	—	$0.00	$0.00
#47	MiMo-V2-Pro Xiaomi	63.8	60.7	57.7	41.7	95	—	—	60.7	87	—	—	—	42.5	40.9	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	95	—	—	—	—	—	—	—	—	—	—	—	—	28.3	—	—	—	—	—	—	—	60.7	—	2026	—	llm	API only	—	—	1M	60	2.01	$1.00	$3.00
#48	MiniMax M2.7 MiniMax	63.6	68.7	57.8	43.2	84.8	—	—	68.7	87.4	—	—	—	47	39.4	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	84.8	—	—	—	—	—	—	—	—	—	—	—	—	28.1	—	—	—	—	—	—	—	68.7	—	2026	—	llm	Open weights	—	—	205K	50	1.32	$0.28	$1.20
#49	MiMo-V2-Omni Xiaomi	61.3	66.7	51.4	35.8	91.2	—	—	66.7	82.8	—	—	—	36.7	34.8	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	91.2	—	—	—	—	—	—	—	—	—	—	—	—	19.9	—	—	—	—	—	—	—	66.7	—	2026	—	multimodal	API only	—	—	262K	108	1.36	$0.40	$2.00
#50	GPT-5.4 mini OpenAI	65.2	69.3	57.1	51.1	83.3	—	—	69.3	87.5	—	—	—	49.9	52.3	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	83.3	—	—	—	—	—	—	—	—	—	—	—	—	26.6	—	—	—	—	—	—	—	69.3	—	2026	—	llm	API only	—	2025	400K	162	0.63	$0.75	$4.50

The four score columns immediately after Index are the v1.2 weighted components (25% each) that feed it. The three columns after them (math, multimodal, long context) are reference per-category averages and are not part of the index. Every individual benchmark in our catalog is also shown, grouped by category and ordered by coverage. Scores are curated approximations; see each model for sources.
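As a rough illustration of how the index arithmetic works, the sketch below recomputes the Qwen3.5 Omni Flash index from the benchmark scores in its table row. The page states only that the index is the equal-weighted mean of four components. The rule that each component is the mean of whichever of its benchmarks have a score is an assumption inferred from the rows shown here, and returning `None` when a component has no scores is likewise a guess. The names `COMPONENTS`, `component_score`, and `index_score` are hypothetical and not part of any published API.

```python
from statistics import mean
from typing import Dict, List, Optional

Scores = Dict[str, Optional[float]]

# Benchmarks feeding each index component, copied from the Qwen3.5 Omni Flash
# breakdown in the table. Each component carries a 25% weight.
COMPONENTS: Dict[str, List[str]] = {
    "General": ["SimpleQA", "AA-LCR", "LongBench-v2", "IFBench"],
    "Reasoning": ["GPQA Diamond", "Humanity's Last Exam", "FrontierMath", "ARC-AGI-2"],
    "Coding": ["SWE-bench Verified", "Terminal-Bench", "Aider Polyglot", "SciCode"],
    "Tool use & agents": ["TAU-bench Retail", "τ²-bench", "BFCL", "BrowseComp"],
}


def component_score(scores: Scores, benchmarks: List[str]) -> Optional[float]:
    """Mean of the benchmarks that have a score (assumed rule, inferred from the rows)."""
    available = [scores[name] for name in benchmarks if scores.get(name) is not None]
    return mean(available) if available else None


def index_score(scores: Scores) -> Optional[float]:
    """Equal-weighted mean of the four components.

    Returns None if any component has no scores at all (an assumption; the
    page does not say how such models are handled).
    """
    parts = [component_score(scores, names) for names in COMPONENTS.values()]
    if any(part is None for part in parts):
        return None
    return mean(parts)


# Scores reported for Qwen3.5 Omni Flash; benchmarks shown as missing are omitted.
qwen_omni_flash: Scores = {
    "AA-LCR": 44.0,
    "GPQA Diamond": 74.2,
    "Humanity's Last Exam": 7.1,
    "Terminal-Bench": 8.3,
    "SciCode": 25.5,
    "τ²-bench": 84.5,
}

print(round(index_score(qwen_omni_flash), 1))  # 46.5
```

Under the same assumed rule, the Coding score for Claude Opus 4.7 also works out: (54.5 + 54.5 + 87.6) / 3 ≈ 65.5, matching its row.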