AI models
Every model we track: frontier flagships, open-weights specialists, and narrow benchmark-only releases. Filter by lab, country, access, modality, or release window. Sorted newest first by default; head to the leaderboard for the ranked view.
Updated May 30, 2026 · Benchmarks via Artificial Analysis, specs & pricing via OpenRouter
| Rank | Model | Index | General | Reasoning | Coding | Agents | Math | Multi | Long ctx | GPQA Diamond | DROP | ARC-AGI-2 | BIG-Bench Hard | SciCode | Terminal-Bench | LiveCodeBench | SWE-bench Verified | Aider Polyglot | HumanEval | Aider Polyglot Edit | MBPP | MultiPL-E | SWE-bench Pro | AIME 2025 | MATH-500 | AIME 2024 | MATH | GSM8K | MGSM | HMMT 2025 | FrontierMath | τ²-bench | TAU-bench Retail | TAU-bench Airline | BFCL | BrowseComp | τ²-bench Airline | τ²-bench Retail | MMMU | MathVista | ChartQA | DocVQA | MMMU-Pro | AI2D | Humanity’s Last Exam | MMLU-Pro | MMLU | IFEval | SimpleQA | Multi-IF | LiveBench | Arena Hard | AA-LCR | LongBench-v2 | Released | Country | Type | Access | Params | Cutoff | Context | Speed | Latency | In $/M | Out $/M |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| #101 | — | 61.9 | 64.3 | 52.9 | 35.3 | 95 | 96.3 | — | 64.3 | 84.6 | — | — | — | 39.4 | 31.1 | 86.8 | — | — | — | — | — | — | — | 96.3 | — | — | — | — | — | — | — | 95 | — | — | — | — | — | — | — | — | — | — | — | — | 21.1 | 84.3 | — | — | — | — | — | — | 64.3 | — | 2025 | — | llm | Open weights | — | — | 262K | 145 | 1.34 | $0.10 | $0.30 |
Index 61.9 = (64.3 + 52.9 + 35.3 + 95.0) / 4, the equal-weighted mean of 4 components: General 64.3 (25%), Reasoning 52.9 (25%), Coding 35.3 (25%), Tool use & agents 95 (25%).
| #102 | — | 11.8 | 0 | 32.6 | 14.7 | 0 | 77.3 | — | 0 | 59.1 | — | — | — | 29.3 | 0 | 69.5 | — | — | — | — | — | — | — | 77.3 | — | — | — | — | — | — | — | 0 | — | — | — | — | — | — | — | — | — | — | — | — | 6 | 76.3 | — | — | — | — | — | — | 0 | — | 2025 | — | llm | — | — | — | — | — | — | $0.00 | $0.00 |
| #103 | — | 69.4 | 72.7 | 60.2 | 59.7 | 84.8 | 100 | — | 72.7 | 92.4 | — | 52.9 | — | 52.1 | 47 | 89.4 | 80 | — | — | — | — | — | — | 100 | — | — | — | — | — | — | — | 84.8 | — | — | — | — | — | — | — | — | — | — | — | — | 35.4 | 87.4 | — | — | — | — | — | — | 72.7 | — | 2025 | — | llm | API only | — | — | 400K | 73 | 0.69 | $1.75 | $14.00 |
| #104 | Korea Telecom | 39 | 11 | 40.5 | 18.1 | 86.5 | 78.7 | — | 11 | 72.2 | — | — | — | 33.2 | 3 | 65.6 | — | — | — | — | — | — | — | 78.7 | — | — | — | — | — | — | — | 86.5 | — | — | — | — | — | — | — | — | — | — | — | — | 8.8 | 81.3 | — | — | — | — | — | — | 11 | — | 2025 | — | llm | — | — | — | — | — | — | $0.00 | $0.00 |
| #105 | — | 7.6 | 0 | 23.5 | 6.7 | 0 | — | — | 0 | 42.5 | — | — | — | 13.3 | 0 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 0 | — | — | — | — | — | — | — | — | — | — | — | — | 4.4 | — | — | — | — | — | — | — | 0 | — | 2025 | — | llm | — | — | — | — | — | — | $0.00 | $0.00 |
Index 7.6 = (0.0 + 23.5 + 6.7 + 0.0) / 4, the equal-weighted mean of 4 components: General 0 (25%), Reasoning 23.5 (25%), Coding 6.7 (25%), Tool use & agents 0 (25%).
| #106 | — | 28.1 | 30 | 31.5 | 26 | 24.9 | 36.7 | — | 30 | 59.4 | — | — | — | 33.1 | 18.9 | 44.8 | — | — | — | — | — | — | — | 36.7 | — | — | — | — | — | — | — | 24.9 | — | — | — | — | — | — | — | — | — | — | — | — | 3.6 | 76.2 | — | — | — | — | — | — | 30 | — | 2025 | — | llm | Open weights | — | — | 262K | 51 | 0.64 | $0.40 | $2.00 |
| #107 | — | 24.6 | 24 | 28.3 | 22.8 | 23.4 | 34.3 | — | 24 | 53.2 | — | — | — | 28.8 | 16.7 | 34.8 | — | — | — | — | — | — | — | 34.3 | — | — | — | — | — | — | — | 23.4 | — | — | — | — | — | — | — | — | — | — | — | — | 3.4 | 67.8 | — | — | — | — | — | — | 24 | — | 2025 | — | llm | — | — | — | — | 62 | 0.75 | $0.00 | $0.00 |
| #108 | — | 33.7 | 40.3 | 40.4 | 22.4 | 31.6 | 85.3 | — | 40.3 | 71.9 | — | — | — | 30.4 | 14.4 | 41.1 | — | — | — | — | — | — | — | 85.3 | — | — | — | — | — | — | — | 31.6 | — | — | — | — | — | — | — | — | — | — | — | — | 8.9 | 79.9 | — | — | — | — | — | — | 40.3 | — | 2025 | — | multimodal | Open weights | — | — | 131K | 44 | 1.31 | $0.30 | $0.90 |
| #109 | MBZUAI Institute of Foundation Models | 29.8 | 33.3 | 38.9 | 19.2 | 27.8 | 78.3 | — | 33.3 | 68.1 | — | — | — | 28.6 | 9.8 | 69.4 | — | — | — | — | — | — | — | 78.3 | — | — | — | — | — | — | — | 27.8 | — | — | — | — | — | — | — | — | — | — | — | — | 9.8 | 78.6 | — | — | — | — | — | — | 33.3 | — | 2025 | — | llm | — | — | — | — | — | — | $0.00 | $0.00 |
| #110 | Motif Technologies | 28.6 | 13 | 38.9 | 16 | 46.5 | 80.3 | — | 13 | 69.5 | — | — | — | 28.2 | 3.8 | 65.1 | — | — | — | — | — | — | — | 80.3 | — | — | — | — | — | — | — | 46.5 | — | — | — | — | — | — | — | — | — | — | — | — | 8.2 | 79.6 | — | — | — | — | — | — | 13 | — | 2025 | — | llm | — | — | — | — | — | — | $0.00 | $0.00 |
| #111 | — | 51.8 | 58.3 | 46 | 27.2 | 75.7 | 94.3 | — | 58.3 | 81.1 | — | — | — | 36.9 | 17.4 | 71.1 | — | — | — | — | — | — | — | 94.3 | — | — | — | — | — | — | — | 75.7 | — | — | — | — | — | — | — | — | — | — | — | — | 10.9 | 81.8 | — | — | — | — | — | — | 58.3 | — | 2025 | — | multimodal | API only | — | — | 1M | 229 | 0.89 | $0.30 | $2.50 |
| #112 | — | 30.4 | 34.7 | 36.1 | 26.1 | 24.6 | 38 | — | 34.7 | 68 | — | — | — | 36.2 | 15.9 | 46.5 | — | — | — | — | — | — | — | 38 | — | — | — | — | — | — | — | 24.6 | — | — | — | — | — | — | — | — | — | — | — | — | 4.1 | 80.7 | — | — | — | — | — | — | 34.7 | — | 2025 | — | llm | Open weights | 675B (41B active) | — | 262K | 54 | 0.64 | $0.50 | $1.50 |
| #113 | — | 23.6 | 22 | 30.9 | 14.1 | 27.2 | 30 | — | 22 | 57.2 | — | — | — | 23.6 | 4.5 | 35.1 | — | — | — | — | — | — | — | 30 | — | — | — | — | — | — | — | 27.2 | — | — | — | — | — | — | — | — | — | — | — | — | 4.6 | 69.3 | — | — | — | — | — | — | 22 | — | 2025 | — | multimodal | Open weights | — | — | 262K | 67 | 0.41 | $0.20 | $0.20 |
| #114 | — | 22.3 | 24 | 25.7 | 12.7 | 26.6 | 31.7 | — | 24 | 47.1 | — | — | — | 20.8 | 4.5 | 30.3 | — | — | — | — | — | — | — | 31.7 | — | — | — | — | — | — | — | 26.6 | — | — | — | — | — | — | — | — | — | — | — | — | 4.3 | 64.2 | — | — | — | — | — | — | 24 | — | 2025 | — | multimodal | Open weights | — | — | 262K | 86 | 0.38 | $0.15 | $0.15 |
| #115 | — | 16.1 | 11.7 | 20.5 | 7.2 | 24.9 | 22 | — | 11.7 | 35.8 | — | — | — | 14.4 | 0 | 24.7 | — | — | — | — | — | — | — | 22 | — | — | — | — | — | — | — | 24.9 | — | — | — | — | — | — | — | — | — | — | — | — | 5.3 | 52.4 | — | — | — | — | — | — | 11.7 | — | 2025 | — | multimodal | Open weights | — | — | 131K | 154 | 0.34 | $0.10 | $0.10 |
| #116 | — | 64.2 | 65 | 53.1 | 48.2 | 90.6 | 92 | — | 65 | 84 | — | — | — | 38.9 | 35.6 | 86.2 | — | 70.2 | — | — | — | — | — | 92 | — | — | — | — | — | — | — | 90.6 | — | — | — | — | — | — | — | — | — | — | — | — | 22.2 | 86.2 | — | — | — | — | — | — | 65 | — | 2025 | — | llm | Open weights | 671B (37B active) | — | 131K | — | — | $0.25 | $0.38 |
| #117 | — | 38.8 | 59.3 | 56.6 | 39.4 | 0 | 96.7 | — | 59.3 | 87.1 | — | — | — | 44 | 34.8 | 89.6 | — | — | — | — | — | — | — | 96.7 | — | — | — | — | — | — | — | 0 | — | — | — | — | — | — | — | — | — | — | — | — | 26.1 | 86.3 | — | — | — | — | — | — | 59.3 | — | 2025 | — | llm | Open weights | — | — | 164K | — | — | $0.29 | $0.43 |
| #118 | — | 57.9 | 61.7 | 43.7 | 33.5 | 92.7 | 89 | — | 61.7 | 78.5 | — | — | — | 42.7 | 24.2 | 73 | — | — | — | — | — | — | — | 89 | — | — | — | — | — | — | — | 92.7 | — | — | — | — | — | — | — | — | — | — | — | — | 8.9 | 83 | — | — | — | — | — | — | 61.7 | — | 2025 | — | llm | — | — | — | — | 149 | 0.81 | $1.30 | $10.00 |
| #119 | Prime Intellect | 31.8 | 32.3 | 44.1 | 24.1 | 26.6 | 88 | — | 32.3 | 76.1 | — | — | — | 39.1 | 9.1 | 77.7 | — | — | — | — | — | — | — | 88 | — | — | — | — | — | — | — | 26.6 | — | — | — | — | — | — | — | — | — | — | — | — | 12.1 | 82.2 | — | — | — | — | — | — | 32.3 | — | 2025 | — | llm | Open weights | — | — | 131K | — | — | $0.20 | $1.10 |
| #120 | — | 49.3 | 53.7 | 41.4 | 21.5 | 80.4 | 89.7 | — | 53.7 | 76 | — | — | — | 36.2 | 6.8 | 66 | — | — | — | — | — | — | — | 89.7 | — | — | — | — | — | — | — | 80.4 | — | — | — | — | — | — | — | — | — | — | — | — | 6.8 | 80.9 | — | — | — | — | — | — | 53.7 | — | 2025 | — | llm | — | — | — | — | — | — | $0.30 | $2.50 |
| #121 | ServiceNow | 46.8 | 50.3 | 41.6 | 25.9 | 69.3 | 88 | — | 50.3 | 73.3 | — | — | — | 37.3 | 14.4 | 80.7 | — | — | — | — | — | — | — | 88 | — | — | — | — | — | — | — | 69.3 | — | — | — | — | — | — | — | — | — | — | — | — | 9.8 | 79 | — | — | — | — | — | — | 50.3 | — | 2025 | — | llm | — | — | — | — | — | — | $0.00 | $0.00 |
| #122 | — | 70.1 | 74 | 57.7 | 59.1 | 89.5 | 91.3 | — | 74 | 87 | — | — | — | 49.5 | 47 | 87.1 | 80.9 | — | — | — | — | — | — | 91.3 | — | — | — | — | — | — | — | 89.5 | — | — | — | — | — | — | — | — | — | — | — | — | 28.4 | 89.5 | — | — | — | — | — | — | 74 | — | 2025 | — | llm | API only | — | — | 200K | 58 | 1.50 | $5.00 | $25.00 |
| #123 | — | 12.2 | 0 | 33.5 | 15.1 | 0 | 73.7 | — | 0 | 61 | — | — | — | 28.6 | 1.5 | 67.2 | — | — | — | — | — | — | — | 73.7 | — | — | — | — | — | — | — | 0 | — | — | — | — | — | — | — | — | — | — | — | — | 5.9 | 75.9 | — | — | — | — | — | — | 0 | — | 2025 | — | llm | Open weights | — | — | 66K | — | — | $0.15 | $0.50 |
| #124 | — | 10.2 | 0 | 22.9 | 5.2 | 12.6 | 41.3 | — | 0 | 40 | — | — | — | 10.3 | 0 | 26.6 | — | — | — | — | — | — | — | 41.3 | — | — | — | — | — | — | — | 12.6 | — | — | — | — | — | — | — | — | — | — | — | — | 5.8 | 52.2 | — | — | — | — | — | — | 0 | — | 2025 | — | llm | — | — | — | — | — | — | $0.10 | $0.20 |
| #125 | — | 9.9 | 0 | 28.7 | 11 | 0 | 70.7 | — | 0 | 51.6 | — | — | — | 21.2 | 0.8 | 61.7 | — | — | — | — | — | — | — | 70.7 | — | — | — | — | — | — | — | 0 | — | — | — | — | — | — | — | — | — | — | — | — | 5.7 | 65.5 | — | — | — | — | — | — | 0 | — | 2025 | — | llm | — | — | — | — | — | — | $0.00 | $0.00 |
Score columns under Index are the v1.2 weighted components (25% each) that feed it. Reference per-category averages (not in the index) follow. Every individual benchmark in our catalog is also shown — grouped by category, ordered by coverage. Hover any header for details — click to sort. Cell colors show relative standing within each column (red → yellow → green). Scores are curated approximations — see each model for sources.
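
To make the Index arithmetic concrete, here is a minimal Python sketch of the equal-weighted mean described above (four components at 25% each). The function name `index_score` and its signature are hypothetical, chosen for illustration; this is not the site's actual implementation. The component values are copied from rows #101 and #117 of the table.

```python
from statistics import fmean


def index_score(general: float, reasoning: float, coding: float, agents: float) -> float:
    """Equal-weighted mean of the four Index components (25% each).

    Hypothetical helper for illustration only; not the site's own code.
    """
    return fmean([general, reasoning, coding, agents])


# Row #101: General 64.3, Reasoning 52.9, Coding 35.3, Agents 95.
row_101 = index_score(64.3, 52.9, 35.3, 95.0)
print(round(row_101, 1))  # 61.9, matching the Index column

# Row #117: an Agents value of 0 is averaged in like any other component,
# which pulls the Index well below the other three scores.
row_117 = index_score(59.3, 56.6, 39.4, 0.0)
print(round(row_117, 1))  # 38.8, matching the Index column
```

Both results agree with the published Index values to one decimal place. The table does not say whether a 0 component is a measured score or a missing result counted as zero, so the sketch only reproduces the arithmetic.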