AI War Tracker
298 models in catalog

AI models

Every model we track: frontier flagships, open-weights specialists, and narrow benchmark-only releases. Filter by lab, country, access, modality, or release window. Sorted newest first by default; head to the leaderboard for the ranked view.

Updated May 30, 2026 · Benchmarks via Artificial Analysis, specs & pricing via OpenRouter

Rank | Model | Index | General | Reason | Coding | Agents | Math | Multi | Long ctx | GPQA Diamond | DROP | ARC-AGI-2 | BIG-Bench Hard | SciCode | Terminal-Bench | LiveCodeBench | SWE-bench Verified | Aider Polyglot | HumanEval | Aider Polyglot Edit | MBPP | MultiPL-E | SWE-bench Pro | AIME 2025 | MATH-500 | AIME 2024 | MATH | GSM8K | MGSM | HMMT 2025 | FrontierMath | τ²-bench | TAU-bench Retail | TAU-bench Airline | BFCL | BrowseComp | τ²-bench Airline | τ²-bench Retail | MMMU | MathVista | ChartQA | DocVQA | MMMU-Pro | AI2D | Humanity’s Last Exam | MMLU-Pro | MMLU | IFEval | SimpleQA | Multi-IF | LiveBench | Arena Hard | AA-LCR | LongBench-v2 | Released | Country | Type | Access | Params | Cutoff | Context | Speed | Latency | In $/M | Out $/M
OpenAI60.26654.144.7766681.746.942.47626.5662026llmAPI only2025400K1570.55$0.20$1.25
Mistral AI39.244.743.227.741.244.776.93817.441.29.544.72026multimodalOpen weights262K1450.51$0.15$0.60
NVIDIA21.116.728.111.628.116.751.316.46.828.14.816.72026llm$0.00$0.00
Zhipu AI63.260.755.138.598.560.784.743.633.398.525.460.72026llmAPI only203K$1.20$4.00
NVIDIA52.56049.632.467.860803628.867.819.2602026llm2111.01$0.30$0.80
xAI64.45959.342.896.55988.544.740.996.530592026llm970.62$2.00$6.00
Alibaba54.75946.92686.85980.627.724.286.813.3592026multimodalOpen weights262K510.33$0.04$0.15
Sarvam25.7041.91446.8073.826.41.546.810.102026llm1281.29$0.00$0.00
Sarvam20.1035.210.834.5063.319.22.334.5702026llm2141.17$0.00$0.00
OpenAI71.37466.857.187.1749256.657.657.787.141.6742026llmAPI only1.1M840.63$2.50$15.00
Inception46.536.346.332.670.836.37738.726.570.815.536.32026llmAPI only128K7906.11$0.25$0.75
Alibaba52.155.742.418.392.155.777.118.318.292.17.855.72026llm1640.24$0.00$0.20
Alibaba3423.725.35.581.623.745.67.23.881.64.923.72026llm3280.24$0.00$0.10
Alibaba21.96.714.31.565.26.723.62.9065.24.96.72026llm1200.23$0.00$0.10
Alibaba62.966.754.636.693.666.785.74231.193.623.466.72026multimodalOpen weights262K1291.07$0.26$2.08
Alibaba62.867.35436.193.967.385.839.532.693.922.267.32026multimodalOpen weights262K911.48$0.20$1.56
Alibaba5962.752.132.189.262.784.537.726.589.219.762.72026multimodalOpen weights262K1211.07$0.14$1.00
Liquid AI10.6025.95.511.1047.410.9011.14.402026llmOpen weights128K2080.30$0.03$0.12
OpenAI69.77465.753.1867491.553.2538639.9742026multimodalAPI only400K7381.08$1.75$14.00
Google76.272.771.964.495.672.794.377.158.953.880.695.644.472.72026multimodalAPI only1M14226.02$2.00$12.00
Anthropic67.270.758.659.879.570.787.558.346.95379.679.53070.72026llmAPI only1M751.13$3.00$15.00
Cohere4.9017.91.80030.53.6005.202026llm1260.35$0.00$0.00
Alibaba65.365.758.341.595.665.789.34240.995.627.365.72026multimodalOpen weights262K531.82$0.39$2.34
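#74 | MiniMax | Index 63 | General 66 | Reason 52 | Coding 38.7 | Agents 95.3 | Long ctx 66 | GPQA Diamond 84.8 | SciCode 42.6 | Terminal-Bench 34.8 | τ²-bench 95.3 | Humanity’s Last Exam 19.1 | AA-LCR 66 | Released 2026 | Type llm | Access Open weights | Context 205K | Speed 87 | Latency 1.16 | In $0.15/M | Out $1.15/M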
Index 63 = (66.0 + 52.0 + 38.7 + 95.3) / 4, the equal-weighted mean of 4 components.
General (25%): 66
  • SimpleQA
  • AA-LCR: 66
  • LongBench-v2
  • IFBench
Reasoning (25%): 52
  • GPQA Diamond: 84.8
  • Humanity’s Last Exam: 19.1
  • FrontierMath
  • ARC-AGI-2
Coding (25%): 38.7
  • SWE-bench Verified
  • Terminal-Bench: 34.8
  • Aider Polyglot
  • SciCode: 42.6
Tool use & agents (25%): 95.3
  • TAU-bench Retail
  • τ²-bench: 95.3
  • BFCL
  • BrowseComp
Zhipu AI68.563.356.655.798.263.38646.243.277.898.227.263.32026llmOpen weights744B (44B active)203K670.77$0.60$1.92

The four score columns after Index (General, Reason, Coding, Agents) are the v1.2 components, weighted 25% each, that feed it. The per-category reference averages that follow (Math, Multi, Long ctx) are not part of the index. Every individual benchmark in our catalog is also shown, grouped by category and ordered by coverage. Hover any header for details; click to sort. Cell colors show relative standing within each column (red → yellow → green). Scores are curated approximations; see each model for sources.
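
As a rough illustration of the index arithmetic, the sketch below recomputes the MiniMax row from its listed benchmark scores. It assumes each component is the plain mean of whichever of its benchmarks have a score, with unscored benchmarks skipped; that rule matches the numbers shown for this row, but it is not stated on the page. The names `BREAKDOWN`, `component_score`, and `index_score` are hypothetical and not part of the tracker.

```python
from statistics import mean

# Benchmark scores copied from the MiniMax breakdown.
# None marks a benchmark that is listed without a score.
BREAKDOWN = {
    "General": {"SimpleQA": None, "AA-LCR": 66.0,
                "LongBench-v2": None, "IFBench": None},
    "Reasoning": {"GPQA Diamond": 84.8, "Humanity's Last Exam": 19.1,
                  "FrontierMath": None, "ARC-AGI-2": None},
    "Coding": {"SWE-bench Verified": None, "Terminal-Bench": 34.8,
               "Aider Polyglot": None, "SciCode": 42.6},
    "Tool use & agents": {"TAU-bench Retail": None, "τ²-bench": 95.3,
                          "BFCL": None, "BrowseComp": None},
}

# Equal weights (25% each), as stated for the v1.2 index.
WEIGHTS = {name: 0.25 for name in BREAKDOWN}


def component_score(benchmarks):
    """Mean of the available scores (assumption: missing ones are skipped)."""
    available = [s for s in benchmarks.values() if s is not None]
    return mean(available) if available else None


def index_score(breakdown, weights=WEIGHTS):
    """Weighted sum of the component scores.

    Returns None if any component has no data; how the tracker handles
    that case is not documented, so this sketch does not guess.
    """
    total = 0.0
    for name, benchmarks in breakdown.items():
        score = component_score(benchmarks)
        if score is None:
            return None
        total += weights[name] * score
    return total


if __name__ == "__main__":
    for name, benchmarks in BREAKDOWN.items():
        print(name, component_score(benchmarks))
    print("Index (rounded):", round(index_score(BREAKDOWN)))
```

Under these assumptions the unrounded Reasoning component is 51.95, shown as 52 on the page, and the index rounds to the 63 listed for this row.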