Fastest AI models
The fastest models ranked by output throughput (tokens per second). Only models with a measured intelligence index are included, so you are comparing genuinely capable options.
- 1. Llama 3.3 70B Instruct: Overall 50.6, 2220 tok/s output speed
- 2. Llama 3.1 8B Instruct: Overall 45, 2047 tok/s output speed
- 3. Llama 3.1 70B Instruct: Overall 56, 1204 tok/s output speed
- 4. gpt-oss-20b: Overall 73.6, 1000 tok/s output speed
- 5. Mercury 2: Overall 77, 790 tok/s output speed
- 6. Llama 4 Scout: Overall 58.9, 776 tok/s output speed
- 7. Llama 4 Maverick: Overall 63.9, 639 tok/s output speed
- 8. Granite 4.0 H Small: Overall 35.7, 524 tok/s output speed
- 9. gpt-oss-120b: Overall 79.6, 500 tok/s output speed
- 10. GPT-5 nano: Overall 71.2, 500 tok/s output speed
- 11. Granite 3.3 8B: Overall 32.5, 376 tok/s output speed
- 12. Gemini 3.1 Flash Lite: Overall 82.2, 342 tok/s output speed
- 13. Qwen3.5 2B: Overall 45.6, 328 tok/s output speed
- 14. Qwen3 32B: Overall 73.8, 328 tok/s output speed
- 15. Nemotron 3 Nano Omni 30B A3B Reasoning: Overall 46.9, 301 tok/s output speed
Rankings are computed from live benchmark data and update automatically. See the full intelligence leaderboard for the overall picture.
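The ranking rule described above (keep only models with a measured intelligence index, then order by output throughput) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the site's actual pipeline: the `ModelResult` type, the `fastest` function, and the unscored sample entry are hypothetical names introduced here, and only the three scored rows come from the list above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModelResult:
    # Hypothetical record type; field names are assumptions for this sketch.
    name: str
    overall: Optional[float]  # intelligence index, None when not measured
    tok_per_s: float          # output throughput in tokens per second


def fastest(models: List[ModelResult], limit: int = 15) -> List[ModelResult]:
    """Drop models without a measured index, then sort by throughput, fastest first."""
    measured = [m for m in models if m.overall is not None]
    return sorted(measured, key=lambda m: m.tok_per_s, reverse=True)[:limit]


sample = [
    ModelResult("gpt-oss-20b", 73.6, 1000),
    ModelResult("Llama 3.3 70B Instruct", 50.6, 2220),
    # Made-up entry showing that an unscored model is excluded however fast it is.
    ModelResult("hypothetical-unscored-model", None, 5000),
    ModelResult("Mercury 2", 77, 790),
]

for rank, m in enumerate(fastest(sample), start=1):
    print(f"{rank}. {m.name}: Overall {m.overall}, {m.tok_per_s} tok/s")
```

Filtering before sorting is what keeps a very fast model without a measured index from displacing the listed leaders.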