AI models
Every model we track: frontier flagships, open-weights specialists, and narrow, benchmarks-only releases. Filter by lab, country, access, modality, or release window. Models are sorted newest first by default; head to the leaderboard for the ranked view.
Updated May 29, 2026 · Benchmarks via Artificial Analysis, specs & pricing via OpenRouter
| Rank | Model | Lab | Country | Type | Access | Params | Released ↓ | Cutoff | Index | General | Reason | Coding | Agents | Math | Multi | Long ctx | GPQA Diamond | DROP | ARC-AGI-2 | BIG-Bench Hard | SciCode | Terminal-Bench | LiveCodeBench | SWE-bench Verified | Aider Polyglot | HumanEval | Aider Polyglot Edit | MBPP | MultiPL-E | SWE-bench Pro | AIME 2025 | MATH-500 | AIME 2024 | MATH | GSM8K | MGSM | HMMT 2025 | FrontierMath | τ²-bench | TAU-bench Retail | TAU-bench Airline | BFCL | BrowseComp | τ²-bench Airline | τ²-bench Retail | MMMU | MathVista | ChartQA | DocVQA | MMMU-Pro | AI2D | Humanity’s Last Exam | MMLU-Pro | MMLU | IFEval | SimpleQA | Multi-IF | LiveBench | Arena Hard | AA-LCR | LongBench-v2 | Context | Speed | Latency | In $/M | Out $/M |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| #276 | o1 | OpenAI | — | llm | API only | — | 2024 | 2023 | 45.1 | 53.2 | 30.4 | 29.9 | 66.7 | 58.9 | 74.7 | 59.3 | 78 | — | — | — | 35.8 | 12.9 | 67.9 | 41 | — | 88.1 | — | — | — | — | — | 97 | 74.3 | 96.4 | 97.1 | 89.3 | — | 5.5 | 62.6 | 70.8 | 50 | — | — | — | — | 77.6 | 71.8 | — | — | — | — | 7.7 | 84.1 | 92 | — | 47 | — | 67 | — | 59.3 | — | 200K | 66 | 0.54 | $15.00 | $60.00 |
| #277 | OLMo 2 7B | Allen Institute for AI | — | llm | — | — | 2024 | — | 4.8 | 0 | 17.2 | 1.9 | 0 | 0.7 | — | 0 | 28.8 | — | — | — | 3.7 | 0 | 4.1 | — | — | — | — | — | — | — | 0.7 | — | — | — | — | — | — | — | 0 | — | — | — | — | — | — | — | — | — | — | — | — | 5.5 | 28.2 | — | — | — | — | — | — | 0 | — | — | — | — | $0.00 | $0.00 |
| #278 | Nova Pro | Amazon | — | multimodal | API only | — | 2024 | — | 24.7 | 19 | 25.2 | 13.5 | 41.2 | 42.8 | 81.5 | 19 | 46.9 | 85.4 | — | 86.9 | 20.8 | 6.1 | 23.3 | — | — | 89 | — | — | — | — | 7 | 78.6 | — | 76.6 | 94.8 | — | — | — | 14 | — | — | 68.4 | — | — | — | 61.7 | — | 89.2 | 93.5 | — | — | 3.4 | 69.1 | 85.9 | 92.1 | — | — | — | — | 19 | — | 300K | 100 | 0.50 | $0.80 | $3.20 |
| #279 | Nova Lite | Amazon | — | multimodal | API only | — | 2024 | — | 22.6 | 17.7 | 23.3 | 7.4 | 42.1 | 41.8 | 78.5 | 17.7 | 42 | 80.2 | — | 82.4 | 13.9 | 0.8 | 16.7 | — | — | 85.4 | — | — | — | — | 7 | 76.5 | — | 73.3 | 94.5 | — | — | — | 17.5 | — | — | 66.6 | — | — | — | 56.2 | — | 86.8 | 92.4 | — | — | 4.6 | 59 | 80.5 | 89.7 | — | — | — | — | 17.7 | — | 300K | 100 | 0.50 | $0.06 | $0.24 |
| #280 | Nova Micro | Amazon | — | llm | API only | — | 2024 | — | 18.2 | 9.7 | 22.4 | 5.5 | 35.1 | 38.2 | — | 9.7 | 40 | 79.3 | — | 79.5 | 9.4 | 1.5 | 14 | — | — | 81.1 | — | — | — | — | 6 | 70.3 | — | 69.3 | 92.3 | — | — | — | 14 | — | — | 56.2 | — | — | — | — | — | — | — | — | — | 4.7 | 53.1 | 77.6 | 87.2 | — | — | — | — | 9.7 | — | 128K | 100 | 0.50 | $0.03 | $0.14 |
| #281 | Pixtral Large | Mistral AI | — | multimodal | API only | — | 2024 | 2024 | 25.8 | 10.3 | 27.1 | 29.2 | 36.5 | 36.9 | 81.7 | 10.3 | 50.5 | — | — | — | 29.2 | — | 26.1 | — | — | — | — | — | — | — | 2.3 | 71.4 | — | — | — | — | — | — | 36.5 | — | — | — | — | — | — | 64 | 69.4 | 88.1 | 93.3 | — | 93.8 | 3.6 | 70.1 | — | — | — | — | — | — | 10.3 | — | 131K | 0 | 0.50 | $2.00 | $6.00 |
| #282 | Claude 3.5 Haiku | Anthropic | — | llm | API only | — | 2024 | 2024 | 26.8 | 23.3 | 22.6 | 23.4 | 37.8 | 72.1 | — | 23.3 | 41.6 | 83.1 | — | — | 27.4 | 2.3 | 31.4 | 40.6 | — | 88.1 | — | — | — | — | — | 72.1 | — | 69.4 | — | 85.6 | — | — | 24.6 | 51 | 22.8 | — | — | — | — | — | — | — | — | — | — | 3.5 | 65 | 80.9 | — | — | — | — | — | 23.3 | — | 200K | 104 | 0.30 | $0.80 | $4.00 |
| #283 | Llama 3.1 Nemotron 70B Instruct | NVIDIA | — | llm | Open weights | 70B | 2024 | 2023 | 17.4 | 7 | 25.6 | 13.9 | 23.1 | 42.2 | — | 7 | 46.5 | — | — | — | 23.3 | 4.5 | 16.9 | — | — | — | — | — | — | — | 11 | 73.3 | — | — | 91.4 | — | — | — | 23.1 | — | — | — | — | — | — | — | — | — | — | — | — | 4.6 | 69 | 80.2 | — | — | — | — | — | 7 | — | — | 292 | 0.24 | $1.20 | $1.20 |
| #284 | Llama 3.2 11B Instruct | Meta | — | multimodal | Open weights | 10.6B | 2024 | 2023 | 12.8 | 11.7 | 19 | 6 | 14.6 | 26.7 | 66.4 | 11.7 | 32.8 | — | — | — | 11.2 | 0.8 | 11 | — | — | — | — | — | — | — | 1.7 | 51.6 | — | 51.9 | — | 68.9 | — | — | 14.6 | — | — | — | — | — | — | 50.7 | 51.5 | 83.4 | 88.4 | 33 | 91.1 | 5.2 | 46.4 | 73 | — | — | — | — | — | 11.7 | — | 128K | 168 | 0.20 | $0.05 | $0.05 |
| #285 | Llama 3.2 3B Instruct | Meta | — | llm | Open weights | — | 2024 | 2023 | 11.8 | 2 | 19 | 5.2 | 21.1 | 26.1 | — | 2 | 32.8 | — | — | — | 5.2 | — | 8.3 | — | — | — | — | — | — | — | 3.3 | 48.9 | — | 48 | 77.7 | 58.2 | — | — | 21.1 | — | — | — | — | — | — | — | — | — | — | — | — | 5.2 | 34.7 | 63.4 | 77.4 | — | — | — | — | 2 | — | 131K | 172 | 0.24 | $0.05 | $0.34 |
| #286 | Llama 3.2 1B Instruct | Meta | — | llm | Open weights | — | 2024 | 2023 | 4.6 | 5 | 12.5 | 0.9 | 0 | 7 | — | 5 | 19.6 | — | — | — | 1.7 | 0 | 1.9 | — | — | — | — | — | — | — | 0 | 14 | — | — | — | — | — | — | 0 | — | — | — | — | — | — | — | — | — | — | — | — | 5.3 | 20 | — | — | — | — | — | — | 5 | — | 131K | 91 | 0.60 | $0.03 | $0.20 |
| #287 | Molmo 7B-D | Allen Institute for AI | — | llm | — | — | 2024 | — | 4.1 | 0 | 14.6 | 1.8 | 0 | 0 | — | 0 | 24 | — | — | — | 3.6 | 0 | 3.9 | — | — | — | — | — | — | — | 0 | — | — | — | — | — | — | — | 0 | — | — | — | — | — | — | — | — | — | — | — | — | 5.1 | 37.1 | — | — | — | — | — | — | 0 | — | — | — | — | $0.00 | $0.00 |
| #288 | Qwen2.5 72B Instruct | Alibaba | — | llm | Open weights | — | 2024 | 2024 | 24.3 | 20.3 | 26.6 | 15.6 | 34.5 | 49.9 | — | 20.3 | 49 | — | — | — | 26.7 | 4.5 | 55.5 | — | — | 86.6 | — | 88.2 | 75.1 | — | 14 | 85.8 | — | 83.1 | 95.8 | — | — | — | 34.5 | — | — | — | — | — | — | — | — | — | — | — | — | 4.2 | 71.1 | — | 84.1 | — | — | 52.3 | 81.2 | 20.3 | — | 131K | 100 | 0.37 | $0.36 | $0.40 |
| #289 | Mistral Large 2 | Mistral AI | — | llm | Open weights | 123B | 2024 | — | 20.6 | 5.3 | 26.3 | 17.7 | 33 | 43.8 | — | 5.3 | 48.6 | — | — | — | 29.2 | 6.1 | 29.3 | — | — | 92 | — | — | — | — | 14 | 73.6 | — | — | 93 | — | — | — | 33 | — | — | — | — | — | — | — | — | — | — | — | — | 4 | 69.7 | 84 | — | — | — | — | — | 5.3 | — | 128K | 42 | 0.40 | $2.00 | $6.00 |
| #290 | Llama 3.1 405B Instruct | Meta | — | llm | Open weights | 405B | 2024 | — | 31 | 24.3 | 27.5 | 18.3 | 53.8 | 36.7 | — | 24.3 | 50.7 | 84.8 | — | — | 29.9 | 6.8 | 30.5 | — | — | 89 | — | — | — | — | 3 | 70.3 | — | 73.8 | 96.8 | — | — | — | 19 | — | — | 88.5 | — | — | — | — | — | — | — | — | — | 4.2 | 73.3 | 87.3 | 88.6 | — | — | — | — | 24.3 | — | 128K | 100 | 0.40 | $0.89 | $0.89 |
| #291 | Llama 3.1 70B Instruct | Meta | — | llm | Open weights | — | 2024 | 2023 | 23.6 | 6.3 | 23.2 | 14.9 | 50 | 34.5 | — | 6.3 | 41.7 | 79.6 | — | — | 26.7 | 3 | 23.2 | — | — | 80.5 | — | — | — | — | 4 | 64.9 | — | — | — | — | — | — | 15.2 | — | — | 84.8 | — | — | — | — | — | — | — | — | — | 4.6 | 66.4 | 83.6 | 87.5 | — | — | — | — | 6.3 | — | 131K | 1204 | 0.20 | $0.40 | $0.40 |
| #292 | Llama 3.1 8B Instruct | Meta | — | llm | Open weights | — | 2024 | 2023 | 21.7 | 15.7 | 17.8 | 7 | 46.3 | 28.1 | — | 15.7 | 30.4 | 59.5 | — | — | 13.2 | 0.8 | 11.6 | — | — | 72.6 | — | — | — | — | 4.3 | 51.9 | — | — | — | — | — | — | 16.4 | — | — | 76.1 | — | — | — | — | — | — | — | — | — | 5.1 | 48.3 | 69.4 | 80.4 | — | — | — | — | 15.7 | — | 131K | 2047 | 0.20 | $0.02 | $0.05 |
| #293 | GPT-4o | OpenAI | — | multimodal | API only | — | 2024 | 2023 | 38.8 | 45.6 | 37.7 | 27.2 | 44.6 | 42.7 | 77.7 | 53 | 70.1 | — | — | — | 36.6 | 8.3 | 42.5 | 33.2 | 30.7 | 90.2 | 18.2 | — | — | — | 25.7 | 89.3 | 13.1 | — | — | — | — | — | 28.9 | 60.3 | 42.8 | — | — | 45.5 | 63.4 | 72.2 | 61.4 | 85.7 | 92.8 | 59.9 | 94.2 | 5.3 | 74.7 | 88.7 | 81 | 38.2 | 60.9 | — | — | 53 | — | 128K | 132 | 0.50 | $2.50 | $10.00 |
| #294 | Phi-3 Mini Instruct 3.8B | Microsoft | — | llm | — | — | 2024 | — | 6.2 | 2 | 18.2 | 4.5 | 0 | 23 | — | 2 | 31.9 | — | — | — | 9 | 0 | 11.6 | — | — | — | — | — | — | — | 0.3 | 45.7 | — | — | — | — | — | — | 0 | — | — | — | — | — | — | — | — | — | — | — | — | 4.4 | 43.5 | — | — | — | — | — | — | 2 | — | — | — | — | $0.00 | $0.00 |
| #295 | Llama 3 70B Instruct | Meta | — | llm | Open weights | — | 2024 | 2023 | 7.8 | 0 | 21.2 | 9.9 | 0 | 48.3 | — | 0 | 37.9 | — | — | — | 18.9 | 0.8 | 19.8 | — | — | — | — | — | — | — | — | 48.3 | — | — | — | — | — | — | 0 | — | — | — | — | — | — | — | — | — | — | — | — | 4.4 | 57.4 | — | — | — | — | — | — | 0 | — | 8K | 45 | 0.70 | $0.51 | $0.74 |
| #296 | Llama 3 8B Instruct | Meta | — | llm | Open weights | — | 2024 | 2023 | 5.9 | 0 | 17.4 | 6 | 0 | 49.9 | — | 0 | 29.6 | — | — | — | 11.9 | 0 | 9.6 | — | — | — | — | — | — | — | — | 49.9 | — | — | — | — | — | — | 0 | — | — | — | — | — | — | — | — | — | — | — | — | 5.1 | 40.5 | — | — | — | — | — | — | 0 | — | 8K | 81 | 0.51 | $0.04 | $0.04 |
| #297 | Claude 3 Haiku | Anthropic | — | multimodal | API only | — | 2024 | 2023 | 17.6 | 21 | 18.6 | 9.7 | 21.1 | 39.4 | — | 21 | 33.3 | 78.4 | — | 73.7 | 18.6 | 0.8 | 15.4 | — | — | 75.9 | — | — | — | — | — | 39.4 | — | 38.9 | 88.9 | 75.1 | — | — | 21.1 | — | — | — | — | — | — | — | — | — | — | — | — | 3.9 | — | 75.2 | — | — | — | — | — | 21 | — | 200K | 104 | 0.40 | $0.25 | $1.25 |
| #298 | Mistral 7B Instruct | Mistral AI | — | llm | — | — | 2023 | — | 3.4 | 0 | 11 | 2.4 | 0 | 12.1 | — | 0 | 17.7 | — | — | — | 2.4 | — | 4.6 | — | — | — | — | — | — | — | — | 12.1 | — | — | — | — | — | — | 0 | — | — | — | — | — | — | — | — | — | — | — | — | 4.3 | 24.5 | — | — | — | — | — | — | 0 | — | — | 90 | 0.39 | $0.20 | $0.20 |
Score columns: the first four after Index (General, Reason, Coding, Agents) are the v1.2 weighted components, 25% each, that feed the Index. The next three (Math, Multi, Long ctx) are legacy per-category averages, shown for reference and not part of the Index. The remaining score columns, from GPQA Diamond through LongBench-v2, cover every individual benchmark in our catalog. Cell colors show relative standing within each column (red → yellow → green). Scores are curated approximations; see each model for sources.
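The note above says the four weighted components contribute 25% each to the Index. A minimal sketch of that arithmetic follows, assuming the Index is a plain weighted sum of the General, Reason, Coding, and Agents columns, rounded half-up to one decimal. The function name `index_score`, the `WEIGHTS` table, and the rounding rule are illustrative assumptions, not the site's published method.

```python
from decimal import Decimal, ROUND_HALF_UP

# Assumed weights: the note states 25% for each of the four v1.2 components.
WEIGHTS = {
    "general": Decimal("0.25"),
    "reason": Decimal("0.25"),
    "coding": Decimal("0.25"),
    "agents": Decimal("0.25"),
}


def index_score(general, reason, coding, agents):
    """Hypothetical helper: weighted sum of the four components, to 0.1."""
    parts = {
        "general": general,
        "reason": reason,
        "coding": coding,
        "agents": agents,
    }
    # Decimal avoids binary floating-point surprises at exact .x5 ties.
    total = sum(
        (WEIGHTS[name] * Decimal(str(value)) for name, value in parts.items()),
        Decimal(0),
    )
    # Half-up rounding is an assumption; the source does not state a rule.
    return float(total.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP))


# Component values copied from the table rows above.
print(index_score(19, 25.2, 13.5, 41.2))    # Nova Pro
print(index_score(45.6, 37.7, 27.2, 44.6))  # GPT-4o
print(index_score(53.2, 30.4, 29.9, 66.7))  # o1
```

With the Nova Pro, GPT-4o, and o1 component values from the table, this yields 24.7, 38.8, and 45.1, which match the Index column for those rows. That is consistent with the stated weighting, but it does not confirm how the site itself rounds or handles missing components.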