AI War Tracker
298 models in catalog

AI models

Every model we track: frontier flagships, open-weights specialists, and narrow benchmark-only releases. Filter by lab, country, access, modality, or release window. Models are sorted newest first by default; head to the leaderboard for the ranked view.

Updated May 30, 2026 · Benchmarks via Artificial Analysis, specs & pricing via OpenRouter

Rank | Model | Index | General | Reason | Coding | Agents | Math | Multi | Long ctx | GPQA Diamond | DROP | ARC-AGI-2 | BIG-Bench Hard | SciCode | Terminal-Bench | LiveCodeBench | SWE-bench Verified | Aider Polyglot | HumanEval | Aider Polyglot Edit | MBPP | MultiPL-E | SWE-bench Pro | AIME 2025 | MATH-500 | AIME 2024 | MATH | GSM8K | MGSM | HMMT 2025 | FrontierMath | τ²-bench | TAU-bench Retail | TAU-bench Airline | BFCL | BrowseComp | τ²-bench Airline | τ²-bench Retail | MMMU | MathVista | ChartQA | DocVQA | MMMU-Pro | AI2D | Humanity’s Last Exam | MMLU-Pro | MMLU | IFEval | SimpleQA | Multi-IF | LiveBench | Arena Hard | AA-LCR | LongBench-v2 | Released | Country | Type | Access | Params | Cutoff | Context | Speed | Latency | In $/M | Out $/M
Meta20.425.830.89.315.549.280.825.857.2171.532.867.81484.450.390.615.569.470.788.894.44.374.379.625.82025multimodalOpen weights109B total / 17B active (MoE)202410M7760.31$0.08$0.30
Google50.158.435.652.454.192.279.666844.942.826.580.163.876.572.78896.7929254.179.617.88650.8662025multimodalAPI only20251M850.70$1.25$10.00
DeepSeek40.14136.835.447.164.84168.435.815.249.255.1419459.447.15.281.2412025llmOpen weights671000000000164K$0.28$1.14
NVIDIA23.71736.614.126.977.51766.728.202891.358.496.626.96.578.588.3172025llmOpen weights499000000002023$0.00$0.00
Mistral AI21.819.725.117.125.137.219.745.426.57.621.23.770.725.14.865.919.72025llm1340.52$0.10$0.20
Cohere50.54643.831.480.747.54676.137.82528.71381.980.711.471.2462025llmOpen weights2024256K2030.17$2.50$10.00
Google6.4014.50.410.525.9023.70.701.73.348.410.55.213.502025llm$0.00$0.00
Allen Institute for AI5.6018.3403.3032.8806.83.303.751.102025llm$0.00$0.00
Google13.15.723.812.510.554.55.742.821.23.813.720.788.310.54.766.95.72025llm$0.10$0.30
Google11.66.719.89.110.851.86.734.917.40.813.718.385.310.84.859.56.72025llm$0.10$0.30
Reka AI10.602913.4061.5052.926.7043.533.789.305.166.902025llmOpen weights202566K932.81$0.10$0.20
Google85.717.24.1544.75.729.17.30.811.212.776.655.241.75.72025llm$0.00$0.10
Alibaba39.12536.728.466.466.42565.235.863.420.92990.679.566.48.276.483.973.1252025llmOpen weights325000000002024310.45$0.70$1.00
OpenAI60.162.571.43868.436.773.871.4388844.936.7859768.45075.272.390.888.262.570.82025multimodalAPI only128K5020.00$75.00$150.00
Anthropic57.360.747.652.76879.17560.784.840.335.247.370.364.96196.2808254.781.258.47510.383.786.193.260.72025llmAPI only200K1010.40$3.00$15.00
xAI53.750.345.12990.49250.379.140.617.469.684.799.290.411.182.850.32025llm330.52$0.30$0.50
xAI43.154.744.824.148.891.27854.784.636.811.479.493.38793.348.8785.18054.72025multimodalAPI only2024128K1000.70$3.00$15.00
OpenAI36.327.232.940.744.56539.377.239.96.873.449.366.760.498.587.397.9929.231.357.632.412.380.286.993.91579.584.639.32025llmAPI only2023200K1155.20$1.10$4.40
Mistral AI17.1025.223.619.637.9046.223.625.24.371.519.64.165.202025llmOpen weights202333K1360.53$0.05$0.08
DeepSeek34.352.340.432.911.482.352.371.535.76.161.756.96896.611.49.384.490.852.32025llmOpen weights671B total / 37B active (MoE)128K1890.07$0.55$2.19
DeepSeek21.31135.716.421.978.31165.231.31.557.553.794.586.721.96.179.5112025llmOpen weights70600000000128K370.65$0.10$0.40
Microsoft11.61.530.114.9049.5056.175.5263.823.182.8188180.480.604.170.484.863347.675.402025llmOpen weights202416K330.20$0.07$0.14
DeepSeek30.534.231.433.522.851.838.959.191.635.46.837.64249.679.72690.239.222.83.675.988.586.124.92948.72024llmOpen weights671B total / 37B active (MoE)2024131K1000.50$0.23$0.91
Google27.928.333.72029.557.470.728.362.1343.835.122.221.79389.729.570.75.376.48728.32024multimodalAPI only20241M1830.40$0.10$0.40
Meta20.91527.314.526.642.51550.526328.888.47.777.37791.126.6468.98692.1152024llmOpen weights2023131K22200.50$0.10$0.32

The score columns under Index are the four v1.2 components that feed it, each weighted 25%. Reference per-category averages, which are not part of the index, follow. Every individual benchmark in our catalog is also shown, grouped by category and ordered by coverage. Hover over any header for details; click to sort. Cell colors show relative standing within each column (red → yellow → green). Scores are curated approximations; see each model for sources.
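
As a rough illustration of the 25%-each weighting described above, the sketch below recomputes the Index from four component scores. It assumes the components are the General, Reason, Coding, and Agents columns, which is inferred from the column order rather than stated on the page; the function name and dictionary keys are hypothetical. The sample values are read from the first two table rows, whose listed Index values (20.4 and 50.1) this reproduces to within rounding.

```python
# Assumed v1.2 weights: four components at 25% each. The component names
# are inferred from the column order and are not confirmed by the source.
WEIGHTS = {"general": 0.25, "reason": 0.25, "coding": 0.25, "agents": 0.25}


def composite_index(scores: dict[str, float]) -> float:
    """Return the weighted sum of the four component scores (0-100 scale)."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise KeyError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Component values read from the first two rows of the table.
meta_row = {"general": 25.8, "reason": 30.8, "coding": 9.3, "agents": 15.5}
google_row = {"general": 58.4, "reason": 35.6, "coding": 52.4, "agents": 54.1}

if __name__ == "__main__":
    for label, row in (("Meta", meta_row), ("Google", google_row)):
        print(f"{label}: {composite_index(row):.3f}")
```

With equal weights this is simply the mean of the four components, so a model that scores 0 on one component (as several rows do) loses a quarter of its potential Index regardless of how it does elsewhere.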