AI Hub

Fastest AI models

The fastest models by output throughput (tokens per second), limited to models with a measured intelligence index so you are comparing genuinely capable options.

  1. Llama 3.3 70B Instruct: Overall 50.6, output speed 2220 tok/s
  2. Llama 3.1 8B Instruct: Overall 45, output speed 2047 tok/s
  3. Llama 3.1 70B Instruct: Overall 56, output speed 1204 tok/s
  4. gpt-oss-20b: Overall 73.6, output speed 1000 tok/s
  5. Mercury 2: Overall 77, output speed 790 tok/s
  6. Llama 4 Scout: Overall 58.9, output speed 776 tok/s
  7. Llama 4 Maverick: Overall 63.9, output speed 639 tok/s
  8. Granite 4.0 H Small: Overall 35.7, output speed 524 tok/s
  9. gpt-oss-120b: Overall 79.6, output speed 500 tok/s
  10. GPT-5 nano: Overall 71.2, output speed 500 tok/s
  11. Granite 3.3 8B: Overall 32.5, output speed 376 tok/s
  12. Gemini 3.1 Flash Lite: Overall 82.2, output speed 342 tok/s
  13. Qwen3.5 2B: Overall 45.6, output speed 328 tok/s
  14. Qwen3 32B: Overall 73.8, output speed 328 tok/s
  15. Nemotron 3 Nano Omni 30B A3B Reasoning: Overall 46.9, output speed 301 tok/s

Rankings are computed from live benchmark data and update automatically. See the full intelligence leaderboard for the overall picture.
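
The ranking rule described on this page (keep only models with a measured intelligence index, then order by output throughput) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the site's actual pipeline: the `Model` class, the `rank_fastest` function, and the unscored entry are hypothetical, and only three rows from the list are included.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Model:
    name: str
    overall: Optional[float]  # intelligence index; None means "not measured"
    tok_per_s: float          # output throughput in tokens per second


def rank_fastest(models, limit=15):
    """Drop models without a measured index, then sort fastest first."""
    measured = [m for m in models if m.overall is not None]
    return sorted(measured, key=lambda m: m.tok_per_s, reverse=True)[:limit]


models = [
    Model("gpt-oss-20b", 73.6, 1000),
    Model("Llama 3.3 70B Instruct", 50.6, 2220),
    # Hypothetical entry, not from the list: fast but unscored, so it is excluded.
    Model("hypothetical-unscored-model", None, 5000),
    Model("Mercury 2", 77, 790),
]

ranked = rank_fastest(models)
for position, m in enumerate(ranked, start=1):
    print(f"{position}. {m.name}: Overall {m.overall}, {m.tok_per_s} tok/s")
```

Python's `sorted` is stable, so models with equal throughput keep their input order in this sketch. How the page itself breaks ties is not stated.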

