Benchmarks

Benchmark	Category	Models	Leader	Score
AIME 2024	Math	46	Grok-3 Mini	95.8/100
AIME 2025	Math	221	Grok-4 Heavy	100/100
FrontierMath	Math	6	GPT-5	26.3/100
GSM8K	Math	45	Kimi K2 Instruct	97.3/100
HMMT 2025	Math	11	Grok 4 Fast	93.3/100
MATH	Math	67	o3-mini	97.9/100
MATH-500	Math	169	GPT-5	99.4/100
MGSM	Math	29	Llama 4 Maverick	92.3/100