AI models
Every model we track — frontier flagships, open-weights specialists, narrow benchmarks-only releases. Filter by lab, country, access, modality, or release window. Sorted by newest by default; head to the leaderboard for the ranked view.
Updated May 30, 2026 · Benchmarks via Artificial Analysis, specs & pricing via OpenRouter
| Rank | Model | Index | General | Reason | Coding | Agents | Math | Multi | Long ctx | GPQA Diamond | DROP | ARC-AGI-2 | BIG-Bench Hard | SciCode | Terminal-Bench | LiveCodeBench | SWE-bench Verified | Aider Polyglot | HumanEval | Aider Polyglot Edit | MBPP | MultiPL-E | SWE-bench Pro | AIME 2025 | MATH-500 | AIME 2024 | MATH | GSM8K | MGSM | HMMT 2025 | FrontierMath | τ²-bench | TAU-bench Retail | TAU-bench Airline | BFCL | BrowseComp | τ²-bench Airline | τ²-bench Retail | MMMU | MathVista | ChartQA | DocVQA | MMMU-Pro | AI2D | Humanity’s Last Exam | MMLU-Pro | MMLU | IFEval | SimpleQA | Multi-IF | LiveBench | Arena Hard | AA-LCR | LongBench-v2 | Released ↓ | Country | Type | Access | Params | Cutoff | Context | Speed | Latency | In $/M | Out $/M |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| #26 | InclusionAI | 42 | 25 | 32.8 | 24.2 | 86 | — | — | 25 | 59.3 | — | — | — | 27.1 | 21.2 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 86 | — | — | — | — | — | — | — | — | — | — | — | — | 6.2 | — | — | — | — | — | — | — | 25 | — | 2026 | — | llm | API only | — | — | 262K | — | — | $0.01 | $0.03 |
| #27 |  | 69.5 | 69.7 | 63.5 | 48.7 | 95.9 | — | — | 69.7 | 91.1 | — | — | — | 53.5 | 43.9 | — | — | — | — | — | — | — | 58.6 | — | — | — | — | — | — | — | — | 95.9 | — | — | — | — | — | — | — | — | — | — | — | — | 35.9 | — | — | — | — | — | — | — | 69.7 | — | 2026 | — | llm | Open weights | 1T (32B active) | — | 262K | 57 | 1.20 | $0.68 | $3.42 |
| #28 |  | 72.8 | 70.3 | 66.9 | 65.5 | 88.6 | — | — | 70.3 | 94.2 | — | — | — | 54.5 | 54.5 | — | 87.6 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 88.6 | — | — | — | — | — | — | — | — | — | — | — | — | 39.6 | — | — | — | — | — | — | — | 70.3 | — | 2026 | — | llm | API only | — | — | 1M | 49 | 1.42 | $5.00 | $25.00 |
| #29 | China Mobile | 41.1 | 11.7 | 37.1 | 22.7 | 93 | — | — | 11.7 | 67.6 | — | — | — | 27.2 | 18.2 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 93 | — | — | — | — | — | — | — | — | — | — | — | — | 6.6 | — | — | — | — | — | — | — | 11.7 | — | 2026 | — | llm | — | — | — | — | — | — | $0.00 | $0.00 |
| #30 |  | 49.3 | 49.3 | 45.5 | 24.3 | 78.1 | — | — | 49.3 | 79.4 | — | — | — | 28 | 20.5 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 78.1 | — | — | — | — | — | — | — | — | — | — | — | — | 11.6 | — | — | — | — | — | — | — | 49.3 | — | 2026 | — | llm | — | — | — | — | — | — | $0.00 | $0.00 |
| #31 |  | 68.5 | 69.7 | 64.2 | 48.5 | 91.5 | — | — | 69.7 | 88.4 | — | — | — | 51.5 | 45.5 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 91.5 | — | — | — | — | — | — | — | — | — | — | — | — | 39.9 | — | — | — | — | — | — | — | 69.7 | — | 2026 | — | multimodal | API only | — | — | — | — | — | $0.00 | $0.00 |
| #32 |  | 65.2 | 62.3 | 57.4 | 43.5 | 97.7 | — | — | 62.3 | 86.8 | — | — | — | 43.8 | 43.2 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 97.7 | — | — | — | — | — | — | — | — | — | — | — | — | 28 | — | — | — | — | — | — | — | 62.3 | — | 2026 | — | llm | Open weights | — | — | 203K | 53 | 0.78 | $0.98 | $3.08 |
| #33 |  | 63.6 | 58 | 61.7 | 41.8 | 93 | — | — | 58 | 91.1 | — | — | — | 45.6 | 37.9 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 93 | — | — | — | — | — | — | — | — | — | — | — | — | 32.2 | — | — | — | — | — | — | — | 58 | — | 2026 | — | llm | — | — | — | — | 105 | 0.70 | $2.00 | $6.00 |
| #34 |  | 45.2 | 55.7 | 48.8 | 32.5 | 43.6 | — | — | 55.7 | 79.2 | — | — | — | 40 | 25 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 43.6 | — | — | — | — | — | — | — | — | — | — | — | — | 18.3 | — | — | — | — | — | — | — | 55.7 | — | 2026 | — | multimodal | Open weights | — | — | 262K | 66 | 0.71 | $0.06 | $0.33 |
| #35 |  | 26.1 | 30.7 | 31.2 | 16.4 | 26 | — | — | 30.7 | 57.6 | — | — | — | 24.4 | 8.3 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 26 | — | — | — | — | — | — | — | — | — | — | — | — | 4.7 | — | — | — | — | — | — | — | 30.7 | — | 2026 | — | llm | — | — | — | — | — | — | $0.00 | $0.00 |
Index 26.1 = (30.7 + 31.2 + 16.4 + 26.0) / 4, the equal-weighted mean of 4 components (25% each): General 30.7, Reasoning 31.2, Coding 16.4, Tool use & agents 26.0.
| #36 |  | 66.7 | 69.7 | 57 | 42.3 | 97.7 | — | — | 69.7 | 88.2 | — | — | — | 40.7 | 43.9 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 97.7 | — | — | — | — | — | — | — | — | — | — | — | — | 25.7 | — | — | — | — | — | — | — | 69.7 | — | 2026 | — | multimodal | API only | — | — | 1M | 52 | 1.73 | $0.33 | $1.95 |
| #37 | StepFun | 57.5 | 54.3 | 52.6 | 35.6 | 87.4 | — | — | 54.3 | 82.6 | — | — | — | 38.5 | 32.6 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 87.4 | — | — | — | — | — | — | — | — | — | — | — | — | 22.6 | — | — | — | — | — | — | — | 54.3 | — | 2026 | — | llm | — | — | — | — | 197 | 0.90 | $0.00 | $0.00 |
| #38 |  | 55.4 | 62 | 54.2 | 39.9 | 65.5 | — | — | 62 | 85.7 | — | — | — | 43.4 | 36.4 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 65.5 | — | — | — | — | — | — | — | — | — | — | — | — | 22.7 | — | — | — | — | — | — | — | 62 | — | 2026 | — | multimodal | Open weights | — | — | 262K | 36 | 0.79 | $0.12 | $0.37 |
| #39 |  | 18.3 | 15 | 24 | 12 | 22.2 | — | — | 15 | 43.3 | — | — | — | 20.9 | 3 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 22.2 | — | — | — | — | — | — | — | — | — | — | — | — | 4.8 | — | — | — | — | — | — | — | 15 | — | 2026 | — | llm | — | — | — | — | — | — | $0.00 | $0.00 |
| #40 |  | 61.5 | 61 | 48.4 | 38.1 | 98.5 | — | — | 61 | 80.9 | — | — | — | 43.5 | 32.6 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 98.5 | — | — | — | — | — | — | — | — | — | — | — | — | 15.8 | — | — | — | — | — | — | — | 61 | — | 2026 | — | multimodal | API only | — | — | 203K | — | — | $1.20 | $4.00 |
| #41 |  | 49.4 | 33 | 45 | 29.4 | 90.1 | — | — | 33 | 75.2 | — | — | — | 36.1 | 22.7 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 90.1 | — | — | — | — | — | — | — | — | — | — | — | — | 14.7 | — | — | — | — | — | — | — | 33 | — | 2026 | — | llm | Open weights | — | — | 262K | 129 | 0.61 | $0.22 | $0.85 |
| #42 |  | 55.1 | 52.7 | 48.3 | 30.9 | 88.3 | — | — | 52.7 | 82.6 | — | — | — | 40.5 | 21.2 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 88.3 | — | — | — | — | — | — | — | — | — | — | — | — | 13.9 | — | — | — | — | — | — | — | 52.7 | — | 2026 | — | llm | — | — | — | — | 54 | 1.28 | $0.40 | $4.80 |
| #43 |  | 46.5 | 44 | 40.7 | 16.9 | 84.5 | — | — | 44 | 74.2 | — | — | — | 25.5 | 8.3 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 84.5 | — | — | — | — | — | — | — | — | — | — | — | — | 7.1 | — | — | — | — | — | — | — | 44 | — | 2026 | — | llm | — | — | — | — | 235 | 0.99 | $0.10 | $0.80 |
| #44 |  | 62.5 | 66 | 50.8 | 43.8 | 89.5 | — | — | 66 | 85.5 | — | — | — | 38.3 | 49.2 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 89.5 | — | — | — | — | — | — | — | — | — | — | — | — | 16 | — | — | — | — | — | — | — | 66 | — | 2026 | — | llm | API only | — | — | 256K | 108 | 1.36 | $0.30 | $1.20 |
| #45 |  | 60.6 | 63.7 | 53 | 37.6 | 88 | — | — | 63.7 | 85.5 | — | — | — | 39.5 | 35.6 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 88 | — | — | — | — | — | — | — | — | — | — | — | — | 20.4 | — | — | — | — | — | — | — | 63.7 | — | 2026 | — | llm | — | — | — | — | 110 | 1.51 | $0.40 | $2.00 |
| #46 |  | 39.7 | 34 | 43.6 | 28 | 53.2 | — | — | 34 | 75.8 | — | — | — | 34.8 | 21.2 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 53.2 | — | — | — | — | — | — | — | — | — | — | — | — | 11.4 | — | — | — | — | — | — | — | 34 | — | 2026 | — | llm | — | — | — | — | — | — | $0.00 | $0.00 |
| #47 |  | 63.8 | 60.7 | 57.7 | 41.7 | 95 | — | — | 60.7 | 87 | — | — | — | 42.5 | 40.9 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 95 | — | — | — | — | — | — | — | — | — | — | — | — | 28.3 | — | — | — | — | — | — | — | 60.7 | — | 2026 | — | llm | API only | — | — | 1M | 60 | 2.01 | $1.00 | $3.00 |
| #48 |  | 63.6 | 68.7 | 57.8 | 43.2 | 84.8 | — | — | 68.7 | 87.4 | — | — | — | 47 | 39.4 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 84.8 | — | — | — | — | — | — | — | — | — | — | — | — | 28.1 | — | — | — | — | — | — | — | 68.7 | — | 2026 | — | llm | Open weights | — | — | 205K | 50 | 1.32 | $0.28 | $1.20 |
Index 63.6 = (68.7 + 57.8 + 43.2 + 84.8) / 4, the equal-weighted mean of 4 components (25% each): General 68.7, Reasoning 57.8, Coding 43.2, Tool use & agents 84.8.
| #49 |  | 61.3 | 66.7 | 51.4 | 35.8 | 91.2 | — | — | 66.7 | 82.8 | — | — | — | 36.7 | 34.8 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 91.2 | — | — | — | — | — | — | — | — | — | — | — | — | 19.9 | — | — | — | — | — | — | — | 66.7 | — | 2026 | — | multimodal | API only | — | — | 262K | 108 | 1.36 | $0.40 | $2.00 |
| #50 |  | 65.2 | 69.3 | 57.1 | 51.1 | 83.3 | — | — | 69.3 | 87.5 | — | — | — | 49.9 | 52.3 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 83.3 | — | — | — | — | — | — | — | — | — | — | — | — | 26.6 | — | — | — | — | — | — | — | 69.3 | — | 2026 | — | llm | API only | — | 2025 | 400K | 162 | 0.63 | $0.75 | $4.50 |
The four score columns to the right of Index (General, Reason, Coding, Agents) are the v1.2 weighted components (25% each) that feed it. Reference per-category averages (Math, Multi, Long ctx), which are not in the index, follow. Every individual benchmark in our catalog is also shown, grouped by category and ordered by coverage. Scores are curated approximations; see each model for sources.
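The index arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not the site's actual implementation: the function name `index_score` is hypothetical, and rounding to one decimal place is an assumption based on how values appear in the table. The inputs are the four component scores from rows #35, #48, and #26.

```python
def index_score(general: float, reasoning: float, coding: float, agents: float) -> float:
    """Equal-weighted mean of the four index components (25% each).

    Rounding to one decimal is an assumption taken from how the table
    displays values; the site's exact rounding rule is not stated.
    """
    components = [general, reasoning, coding, agents]
    return round(sum(components) / len(components), 1)


# Component scores copied from the table rows (General, Reason, Coding, Agents).
row_35 = index_score(30.7, 31.2, 16.4, 26.0)
row_48 = index_score(68.7, 57.8, 43.2, 84.8)
row_26 = index_score(25, 32.8, 24.2, 86)

print(row_35, row_48, row_26)
```

Each result matches the Index value listed for that row. Components that are exactly on a rounding boundary could differ by 0.1 from the displayed value, depending on the rounding rule the site applies.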