Math

AIME 2025

American Invitational Mathematics Examination: olympiad-level problems, used as a frontier reasoning test.

Models: 221

Top score: 100

Median: 61
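The three summary figures can be recomputed from the score column of the ranking. The following is a minimal sketch, not the site's own code: the function name `summarize` is hypothetical, and the sample input is only the five highest scores from the ranking, so its median differs from the full-board median. With all 221 entries the median is the 111th score, which in the ranking is 61.

```python
from statistics import median


def summarize(scores):
    """Return the model count, top score, and median of a list of scores."""
    return {
        "models": len(scores),
        "top_score": max(scores),
        "median": median(scores),
    }


# Subset only: the five highest scores listed in the ranking.
top_five = [100, 100, 98.7, 97, 96.7]
summary = summarize(top_five)
print(summary)
```

Passing the full score column instead of `top_five` would reproduce the page's summary figures, assuming they are computed over every listed model.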

State of the art over time

Each point is a model at its release date; the line traces the best score to date.
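The "best score to date" line is a running maximum over the points once they are sorted by release date. Here is a minimal sketch under stated assumptions: the function name `best_to_date` is hypothetical, and the sample points are illustrative placeholders, not dates or scores taken from this leaderboard.

```python
from datetime import date
from itertools import accumulate


def best_to_date(points):
    """Given (release_date, score) pairs, return (release_date, best score so far)
    pairs ordered by release date."""
    ordered = sorted(points, key=lambda p: p[0])
    # accumulate with max yields the running maximum of the scores.
    running = accumulate((score for _, score in ordered), max)
    return [(day, best) for (day, _), best in zip(ordered, running)]


# Placeholder data for illustration only; not real leaderboard entries.
points = [
    (date(2025, 3, 1), 40.0),
    (date(2025, 1, 1), 25.0),
    (date(2025, 5, 1), 35.0),
    (date(2025, 7, 1), 90.0),
]
frontier = best_to_date(points)
for day, best in frontier:
    print(day.isoformat(), best)
```

A model that scores below the current best (the third placeholder point here) leaves the line flat, so the traced line never decreases.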

Ranking

Rank	Model	Organization	Score
1	Grok-4 Heavy	xAI	100
2	GPT-5.2	OpenAI	100
3	GPT-5 Codex	OpenAI	98.7
4	Gemini 3 Flash	Google	97
5	DeepSeek V3.2 Speciale	DeepSeek	96.7
6	MiMo-V2-Flash	Xiaomi	96.3
7	Claude Haiku 4.5	Anthropic	96.3
8	GPT-5.1-Codex	OpenAI	95.7
9	Gemini 3 Pro	Google	95.7
10	GLM 4.7	Zhipu AI	95
11	KAT-Coder-Pro V1	Kuaishou	94.7
12	Kimi K2 Thinking	Moonshot AI	94.7
13	GPT-5	OpenAI	94.6
14	Nova 2 Lite	Amazon	94.3
15	GPT-5.1	OpenAI	94
16	GLM-4.6	Zhipu AI	93.9
17	gpt-oss-120b	OpenAI	93.4
18	Grok-3	xAI	93.3
19	o4-mini	OpenAI	92.7
20	Qwen3-235B-A22B-Thinking-2507	Alibaba	92.3
21	DeepSeek-V3.2	DeepSeek	92
22	Grok 4 Fast	xAI	92
23	GPT-5.1-Codex-Mini	OpenAI	91.7
24	Grok 4	xAI	91.7
25	Claude Opus 4.5	Anthropic	91.3
26	GPT-5 mini	OpenAI	91.1
27	Qwen3 235B A22B 2507	Alibaba	91
28	NVIDIA Nemotron 3 Nano 30B A3B	NVIDIA	91
29	Grok-3 Mini	xAI	90.8
30	K-EXAONE	LG AI Research	90.3
31	Nova 2.0 Omni	Amazon	89.7
32	DeepSeek V3.1 Terminus	DeepSeek	89.7
33	Ring-1T	InclusionAI	89.3
34	Grok 4.1 Fast	xAI	89.3
35	DeepSeek V3.2 Exp	DeepSeek	89.3
36	gpt-oss-20b	OpenAI	89.3
37	Nova 2.0 Pro	Amazon	89
38	Qwen3 VL 235B A22B	Alibaba	88.3
39	Apriel-v1.6-15B-Thinker	ServiceNow	88
40	INTELLECT-3	Prime Intellect	88
41	Gemini 2.5 Pro Preview 06-05	Google	88
42	Gemini 2.5 Pro	Google	88
43	Qwen3 Next 80B A3B Thinking	Alibaba	87.8
44	Apriel-v1.5-15B-Thinker	ServiceNow	87.5
45	DeepSeek-R1-0528	DeepSeek	87.5
46	Claude Sonnet 4.5	Anthropic	87
47	o3	OpenAI	86.4
48	GLM 4.6V	Zhipu AI	85.3
49	GPT-5 nano	OpenAI	85.2
50	ERNIE 5.0 Thinking	Baidu	85
51	Seed-OSS-36B-Instruct	ByteDance	84.7
52	Qwen3 VL 32B	Alibaba	84.7
53	Grok 3 mini Reasoning	xAI	84.7
54	Qwen3-Next-80B-A3B	Alibaba	84.3
55	Ring-flash-2.0	InclusionAI	83.7
56	Qwen3 4B 2507	Alibaba	82.7
57	MiniMax M2.1	MiniMax	82.7
58	Qwen3 VL 30B A3B	Alibaba	82.3
59	Qwen3 Max Thinking	Alibaba	82.3
60	Magistral Medium 1.2	Mistral AI	82
61	Qwen3 235B A22B	Alibaba	81.5
62	Qwen3	Alibaba	81.5
63	GLM 4.5 Air	Zhipu AI	80.7
64	Qwen3 Max	Alibaba	80.7
65	Motif-2-12.7B-Reasoning	Motif Technologies	80.3
66	Magistral Small 1.2	Mistral AI	80.3
67	EXAONE 4.0 32B	LG AI Research	80
68	Falcon-H1R-7B	TII UAE	80
69	Doubao Seed Code	ByteDance	79.3
70	Mi:dm K 2.5 Pro	Korea Telecom	78.7
71	Gemini 2.5 Flash	Google	78.3
72	K2-V2	MBZUAI Institute of Foundation Models	78.3
73	MiniMax-M2	MiniMax	78.3
74	Phi 4 Reasoning Plus	Microsoft	78
75	Claude Opus 4.1	Anthropic	78
76	Olmo 3.1 32B Think	Allen Institute for AI	77.3
77	Llama Nemotron Super 49B v1.5	NVIDIA	76.7
78	Claude Opus 4	Anthropic	75.5
79	NVIDIA Nemotron Nano 12B v2 VL	NVIDIA	75
80	Qwen3 Omni 30B A3B	Alibaba	74
81	Olmo 3 32B Think	Allen Institute for AI	73.7
82	GLM-4.5	Zhipu AI	73.7
83	GLM 4.5V	Zhipu AI	73
84	Qwen3 32B	Alibaba	72.9
85	Cogito v2.1	Deep Cogito	72.7
86	Llama 3.1 Nemotron Ultra 253B v1	NVIDIA	72.5
87	Qwen3 VL 30B A3B Instruct	Alibaba	72.3
88	Nemotron Nano 9B V2	NVIDIA	72.1
89	Gemini 2.5 Flash	Google	72
90	Ling-1T	InclusionAI	71.3
91	Qwen3 30B A3B	Alibaba	70.9
92	Olmo 3 7B Think	Allen Institute for AI	70.7
93	Qwen3 VL 235B A22B Instruct	Alibaba	70.7
94	Claude Sonnet 4	Anthropic	70.5
95	Qwen3-235B-A22B-Instruct-2507	Alibaba	70.3
96	Hermes 4 - Llama-3.1 405B	Nous Research	69.7
97	NVIDIA Nemotron Nano 9B V2	NVIDIA	69.7
98	Qwen3 Next 80B A3B Instruct	Alibaba	69.5
99	Gemini 2.5 Flash-Lite	Google	68.7
100	Hermes 4 - Llama-3.1 70B	Nous Research	68.7
101	Qwen3 VL 32B Instruct	Alibaba	68.3
102	DeepSeek-R1	DeepSeek	68
103	Qwen3 30B A3B 2507 Instruct	Alibaba	66.3
104	Ling-flash-2.0	InclusionAI	65.3
105	Magistral Medium	Mistral AI	64.9
106	DeepSeek R1 0528 Qwen3 8B	DeepSeek	63.7
107	DeepSeek R1 Distill Qwen 32B	DeepSeek	63
108	Phi 4 Reasoning	Microsoft	62.9
109	Magistral Small 2506	Mistral AI	62.8
110	Solar Pro 2	Upstage	61.3
111	MiniMax M1 80k	MiniMax	61
112	Claude 3.7 Sonnet	Anthropic	61
113	HyperCLOVA X SEED Think	Naver	59
114	Llama-3.3 Nemotron Super 49B v1	NVIDIA	58.4
115	Qwen3 14B	Alibaba	58
116	Kimi K2 0905	Moonshot AI	57.3
117	Kimi K2	Moonshot AI	57
118	Qwen3 30B A3B 2507	Alibaba	56.3
119	DeepSeek R1 Distill Qwen 14B	DeepSeek	55.7
120	DeepSeek R1 Distill Llama 70B	DeepSeek	53.7
121	Qwen3 4B 2507 Instruct	Alibaba	52.3
122	Qwen3 Omni 30B A3B Instruct	Alibaba	52.3
123	Exaone 4.0 1.2B	LG AI Research	50.3
124	Llama 3.1 Nemotron Nano 4B v1.1	NVIDIA	50
125	Gemini 2.5 Flash Lite	Google	49.8
126	DeepSeek-V3.1	DeepSeek	49.8
127	Kimi K2-Instruct-0905	Moonshot AI	49.5
128	Kimi K2 Instruct	Moonshot AI	49.5
129	Ling-mini-2.0	InclusionAI	49.3
130	Llama 3.1 Nemotron Nano 8B V1	NVIDIA	47.1
131	GPT-4.1	OpenAI	46.4
132	Grok Code Fast 1	xAI	43.3
133	Magistral Small 1	Mistral AI	41.3
134	Olmo 3 7B Instruct	Allen Institute for AI	41.3
135	DeepSeek R1 Distill Llama 8B	DeepSeek	41.3
136	ERNIE 4.5 300B A47B	Baidu	41.3
137	DeepSeek-V3 0324	DeepSeek	41
138	Magistral Medium 1	Mistral AI	40.3
139	GPT-4.1 Mini	OpenAI	40.2
140	Qwen3 Coder 480B A35B Instruct	Alibaba	39.3
141	Qwen3 1.7B	Alibaba	38.7
142	Mistral Medium 3.1	Mistral AI	38.3
143	Mistral Large 3	Mistral AI	38
144	Qwen3 VL 4B Instruct	Alibaba	37
145	Devstral 2	Mistral AI	36.7
146	Kimi Linear 48B A3B Instruct	Moonshot AI	36.3
147	Devstral Small 2	Mistral AI	34.3
148	Reka Flash 3	Reka AI	33.7
149	Ministral 3 8B	Mistral AI	31.7
150	Qwen3 VL 8B	Alibaba	30.7
151	Mistral Medium 3	Mistral AI	30.3
152	Ministral 3 14B	Mistral AI	30
153	Devstral Small	Mistral AI	29.3
154	QwQ-32B	Alibaba	29
155	Qwen3 Coder 30B A3B Instruct	Alibaba	29
156	Qwen3 VL 8B Instruct	Alibaba	27.3
157	Mistral Small 3.2	Mistral AI	27
158	DeepSeek-V3	DeepSeek	26
159	Qwen3 VL 4B	Alibaba	25.7
160	GPT-4o	OpenAI	25.7
161	LFM2 8B A1B	Liquid AI	25.3
162	Qwen3 8B	Alibaba	24.3
163	GPT-4.1 Nano	OpenAI	24
164	Gemini Diffusion	Google	23.3
165	Qwen3 4B	Alibaba	22.3
166	DeepSeek R1 Distill Qwen 1.5B	DeepSeek	22
167	Ministral 3 3B	Mistral AI	22
168	Gemini 2.0 Flash	Google	21.7
169	Gemma 3 27B Instruct	Google	20.7
170	Llama 4 Maverick	Meta	19.3
171	Gemma 3 12B Instruct	Google	18.3
172	Qwen3 0.6B	Alibaba	18
173	Phi 4	Microsoft	18
174	Nova Premier	Amazon	17.3
175	GPT-4o-mini	OpenAI	14.7
176	Gemma 3n E4B Instruct	Google	14.3
177	Qwen2.5 72B Instruct	Alibaba	14
178	Llama 4 Scout	Meta	14
179	Mistral Large 2	Mistral AI	14
180	MiniMax M1 40k	MiniMax	13.7
181	Granite 4.0 H Small	IBM	13.7
182	Command A	Cohere	13
183	Gemma 3 4B Instruct	Google	12.7
184	Gemma 3n E4B Instructed LiteRT Preview	Google	11.6
185	Gemma 3n E4B Instructed	Google	11.6
186	Llama 3.1 Nemotron 70B Instruct	NVIDIA	11
187	Jamba Reasoning 3B	AI21 Labs	10.7
188	Gemma 3n E2B Instruct	Google	10.3
189	LFM2 2.6B	Liquid AI	8.3
190	Llama 3.3 70B Instruct	Meta	7.7
191	Nova Pro	Amazon	7
192	Nova Lite	Amazon	7
193	Granite 3.3 8B	IBM	6.7
194	Gemma 3n E2B Instructed LiteRT (Preview)	Google	6.7
195	Gemma 3n E2B Instructed	Google	6.7
196	Phi 4 Mini Instruct	Microsoft	6.7
197	Granite 4.0 H 1B	IBM	6.3
198	Granite 4.0 1B	IBM	6.3
199	Nova Micro	Amazon	6
200	Granite 4.0 Micro	IBM	6
201	Devstral Medium	Mistral AI	4.7
202	Llama 3.1 8B Instruct	Meta	4.3
203	Mistral Small 3	Mistral AI	4.3
204	Llama 3.1 70B Instruct	Meta	4
205	Mistral Small 3.1	Mistral AI	3.7
206	OLMo 2 32B	Allen Institute for AI	3.3
207	LFM2 1.2B	Liquid AI	3.3
208	Gemma 3 1B Instruct	Google	3.3
209	Llama 3.2 3B Instruct	Meta	3.3
210	Llama 3.1 405B Instruct	Meta	3
211	Gemma 3 270M	Google	2.3
212	Pixtral Large	Mistral AI	2.3
213	Jamba Large 1.7	AI21 Labs	2.3
214	Llama 3.2 11B Instruct	Meta	1.7
215	Granite 4.0 H 350M	IBM	1.3
216	OLMo 2 7B	Allen Institute for AI	0.7
217	Phi-3 Mini Instruct 3.8B	Microsoft	0.3
218	Jamba 1.7 Mini	AI21 Labs	0.3
219	Granite 4.0 350M	IBM	0
220	Molmo 7B-D	Allen Institute for AI	0
221	Llama 3.2 1B Instruct	Meta	0

Related Math benchmarks

MATH-500	169
MATH	67
AIME 2024	46
GSM8K	45
MGSM	29
HMMT 2025	11