AI War Tracker
298 models in catalog

AI models

Every model we track: frontier flagships, open-weights specialists, and narrow benchmark-only releases. Filter by lab, country, access, modality, or release window. Models are sorted newest first by default; head to the leaderboard for the ranked view.


Updated May 30, 2026 · Benchmarks via Artificial Analysis, specs & pricing via OpenRouter

Rank | Model | Index | General | Reason | Coding | Agents | Math | Multi | Long ctx | GPQA Diamond | DROP | ARC-AGI-2 | BIG-Bench Hard | SciCode | Terminal-Bench | LiveCodeBench | SWE-bench Verified | Aider Polyglot | HumanEval | Aider Polyglot Edit | MBPP | MultiPL-E | SWE-bench Pro | AIME 2025 | MATH-500 | AIME 2024 | MATH | GSM8K | MGSM | HMMT 2025 | FrontierMath | τ²-bench | TAU-bench Retail | TAU-bench Airline | BFCL | BrowseComp | τ²-bench Airline | τ²-bench Retail | MMMU | MathVista | ChartQA | DocVQA | MMMU-Pro | AI2D | Humanity’s Last Exam | MMLU-Pro | MMLU | IFEval | SimpleQA | Multi-IF | LiveBench | Arena Hard | AA-LCR | LongBench-v2 | Released | Country | Type | Access | Params | Cutoff | Context | Speed | Latency | In $/M | Out $/M
Sarvam8.2022.510.1084.7041.617.82.329.584.703.369.602025llm1361.17$0.00$0.00
Anthropic58.164.742.552.472.684.874.464.775.44035.565.572.761.370.599.164.680.56074.49.684.28864.72025llmAPI only20251M1010.40$3.00$15.00
Anthropic50.73633.356.277.486.93679.68.640.939.263.672.57275.598.273.481.459.611.787.388.8362025llmAPI only2025200K1200.40$15.00$75.00
Mistral AI25.926.723.715.33848.926.743.424.56.125.829.368.438463.226.72025llm1900.42$0.10$0.30
Upstage21.8037.917.431.979068.730.24.561.661.396.731.9780.502025llm$0.00$0.00
NVIDIA11.202310.111.772.4040.810.149.35094.711.75.155.602025llm$0.00$0.00
Google6.9017.35.4545.7029.68.62.314.614.377.154.948.802025llm560.55$0.00$0.00
Mistral AI25.52831.118.524.360.52857.833.13.84030.390.724.34.376282025multimodalAPI only2025131K320.56$0.40$2.00
Amazon29.13030.817.338.350.63056.927.96.831.717.383.938.34.773.3302025llm401.31$2.50$12.50
Alibaba28.5037.626.150.183.5066.835.4365.74072.996.181.429.870.38.379.874.993.802025llmOpen weights2025131K3280.93$0.08$0.28
Alibaba25.4029.6234986.7047.588.939.96.170.781.465.981.59385.771.894.483.527.270.811.768.287.877.195.602025llmOpen weights2025131K680.78$0.46$1.82
Alibaba25.4036.217.747.682.4065.828.56.862.670.995.980.42669.16.677.772.274.39102025llmOpen weights2025131K1220.66$0.09$0.45
Alibaba21.4032.418.534.577.1060.431.65.352.35896.134.54.377.402025llmOpen weights2025132K621.01$0.10$0.24
Alibaba18031.612.527.857.4058.922.62.340.624.390.427.84.274.302025llmOpen weights2025131K691.29$0.05$0.40
Alibaba16.1028.716.71957.8052.216.746.522.393.3195.169.602025llm1031.02$0.10$0.40
Alibaba12.5020.43.52664.1035.66.9030.838.789.4265.25702025llm1380.97$0.10$0.40
Alibaba9.5014.82.121.146.5023.94.1012.1187521.15.734.702025llm2250.95$0.10$1.30
OpenAI56.369.333.657.165.273.38269.387.76.54137.180.869.181.386.499.291.615.880.749.764.880.282.986.876.424.385.369.32025llmAPI only2024200K5020.00$2.00$8.00
OpenAI53.15548.149.759.69582.95581.446.515.285.968.168.958.292.798.993.455.671.849.251.581.684.314.783.2552025multimodalAPI only2024200K1155.20$1.10$4.40
IBM9.74.3195.110.536.64.333.810.1012.76.766.510.54.246.84.32025llm37620.60$0.00$0.30
OpenAI48.56135.939.557.653.773.56166.338.113.645.754.651.69452.946.491.348.18728.947.16849.474.872.25.480.690.287.470.8612025multimodalAPI only20241M10010.00$2.00$8.00
OpenAI39.442.334.426.654.454.372.942.36540.47.648.323.634.731.640.292.549.63552.955.83672.773.13.778.187.584.16742.32025multimodalAPI only20241M1505.00$0.40$1.60
OpenAI19.31727.113.22046.155.81750.325.93.832.69.86.22484.829.417.322.61455.456.23.965.780.174.557.2172025multimodalAPI only20241M2002.00$0.10$0.40
NVIDIA19.87.342.118.511.484.87.37634.72.366.372.59711.48.182.589.57.32025llmOpen weights2530000000002023420.72$0.60$1.80
Meta30.64637.321.417.854.178.24669.833.16.843.43015.677.619.388.961.292.317.873.473.79094.459.64.880.585.5462025multimodalOpen weights400B total / 17B active (MoE)20241M6390.20$0.15$0.60

The score columns under Index are its v1.2 weighted components (25% each). Reference per-category averages, which are not part of the index, follow. Every individual benchmark in our catalog is also shown, grouped by category and ordered by coverage. Hover over any header for details; click it to sort. Cell colors show relative standing within each column (red → yellow → green). Scores are curated approximations; see each model's entry for sources.
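As a rough illustration of the 25%-each weighting described above, the sketch below computes an index as an equally weighted sum of four component scores. The component names (general, reason, coding, agents) are an assumption read off the first four score columns after Index, and treating a missing component as 0 is also an assumption, not documented methodology. The sample values are one reading of a fused table row that shows an Index of 58.1.

```python
from typing import Mapping

# Assumed component names: the first four score columns after "Index".
# The 25% weights come from the note above; the names are an inference.
INDEX_WEIGHTS = {"general": 0.25, "reason": 0.25, "coding": 0.25, "agents": 0.25}


def composite_index(scores: Mapping[str, float]) -> float:
    """Return the weighted sum of the component scores.

    A missing component counts as 0.0 here; that handling is an
    assumption made for this sketch.
    """
    return sum(weight * scores.get(name, 0.0) for name, weight in INDEX_WEIGHTS.items())


if __name__ == "__main__":
    # Component values as read from a row whose displayed Index is 58.1.
    row = {"general": 64.7, "reason": 42.5, "coding": 52.4, "agents": 72.6}
    print(f"{composite_index(row):.2f}")
```

For the sample row the weighted sum comes to about 58.05, in line with the 58.1 shown in the Index column after rounding.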