MMLU
Massive Multitask Language Understanding: a multiple-choice benchmark covering 57 subjects across STEM, the humanities, and the social sciences.
Models: 92
Top score: 92.5
Median: 81.2
State of the art over time
Each point is a model at its release date; the line traces the best score to date.
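The "best score to date" line described above is a running maximum over models ordered by release date. A minimal sketch of that calculation follows, assuming each model is represented as a (release date, score) pair. The function name `best_to_date` and the sample values are hypothetical placeholders for illustration, not entries from this leaderboard, and the page's actual plotting code is not shown in the source.

```python
from datetime import date
from itertools import accumulate


def best_to_date(points):
    """Return (release_date, best_score_so_far) pairs in date order.

    `points` is an iterable of (release_date, score) tuples.
    """
    # Order models chronologically so the running maximum follows time.
    ordered = sorted(points, key=lambda p: p[0])
    # accumulate(..., max) yields the highest score seen up to each point.
    running_best = accumulate((score for _, score in ordered), max)
    return [(d, best) for (d, _), best in zip(ordered, running_best)]


# Hypothetical placeholder values, not real leaderboard data.
sample = [
    (date(2023, 3, 1), 70.0),
    (date(2022, 1, 1), 60.0),
    (date(2023, 9, 1), 65.0),
    (date(2024, 2, 1), 80.0),
]

frontier = best_to_date(sample)
for release, best in frontier:
    print(release.isoformat(), best)
```

A model that scores below the current best (the third point chronologically in the sample) still appears as a point on the chart, but the line stays flat at the earlier best until a higher score is released.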