AI Hub

Benchmarks

The evaluations behind the rankings — what each one measures, and which models lead. Scores feed the per-category indices on the leaderboard.

Benchmarks: 42
Models scored: 440
Data points: 2668
Categories: 7

8 benchmarks

Benchmark     Category  Models scored  Leader            Score
AIME 2024     Math      46             Grok-3 Mini       95.8/100
AIME 2025     Math      221            Grok-4 Heavy      100/100
FrontierMath  Math      6              GPT-5             26.3/100
GSM8K         Math      45             Kimi K2 Instruct  97.3/100
HMMT 2025     Math      11             Grok 4 Fast       93.3/100
MATH          Math      67             o3-mini           97.9/100
MATH-500      Math      169            GPT-5             99.4/100
MGSM          Math      29             Llama 4 Maverick  92.3/100