FrontierMath
Models: 6
Top score: 26.3
Median: 9.6
A benchmark of hundreds of original, exceptionally challenging mathematics problems crafted and vetted by expert mathematicians, covering most major branches of modern mathematics from number theory and real analysis to algebraic geometry and category theory.
State of the art over time
Each point is a model at its release date; the line traces the best score to date.
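The "best score to date" line is a running maximum over the points sorted by release date. A minimal sketch, assuming each point is a (release date, score) pair; the dates and scores below are hypothetical placeholders, not leaderboard data:

```python
from datetime import date
from itertools import accumulate

# Hypothetical (release date, score) points, for illustration only.
points = [
    (date(2025, 1, 1), 4.0),
    (date(2025, 5, 1), 7.0),
    (date(2025, 3, 1), 9.0),
    (date(2025, 8, 1), 12.0),
]

# Sort by release date, then take the running maximum of the scores.
points.sort(key=lambda p: p[0])
best_to_date = list(accumulate((score for _, score in points), max))

for (released, score), best in zip(points, best_to_date):
    print(released.isoformat(), score, best)
```

A model that scores below the current best (the 7.0 point here) still appears as a point but leaves the line flat.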
Ranking
| Rank | Model | Score |
| --- | --- | --- |
| 1 | GPT-5 | 26.3 |
| 2 | GPT-5 mini | 22.1 |
| 3 | o3 | 15.8 |
| 4 | GPT-5 nano | 9.6 |
| 5 | o3-mini | 9.2 |
| 6 | o1 | 5.5 |
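The summary figures at the top of the page can be checked against the ranking table. The sketch below recomputes them with Python's standard library. With six scores, the listed median of 9.6 matches the lower of the two middle values, while averaging the two middle values gives 12.7. Which convention the page actually uses is an assumption here, not something the page states.

```python
import statistics

# Scores copied from the ranking table.
scores = {
    "GPT-5": 26.3,
    "GPT-5 mini": 22.1,
    "o3": 15.8,
    "GPT-5 nano": 9.6,
    "o3-mini": 9.2,
    "o1": 5.5,
}

values = list(scores.values())
n_models = len(values)
top_score = max(values)

# Two common conventions for an even number of values:
median_low = statistics.median_low(values)  # lower of the two middle values
median_avg = statistics.median(values)      # mean of the two middle values

print(n_models, top_score, median_low, round(median_avg, 1))
```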