Gemini 3.5 Flash vs Llama 3.1 Nemotron Ultra 253B v1

Google vs NVIDIA — benchmarks, pricing, and capabilities side by side.

  • Gemini 3.5 Flash has the higher intelligence index (92.2 vs 78.3)
  • Llama 3.1 Nemotron Ultra 253B v1 is cheaper ($0.60 vs $1.50 per 1M input tokens, $1.80 vs $9.00 per 1M output tokens)
  • Gemini 3.5 Flash is faster (221 vs 42 tok/s)
                              Gemini 3.5 Flash    Llama 3.1 Nemotron Ultra 253B v1
Intelligence index            92.2                78.3
Developer                     Google              NVIDIA
Type                          Multimodal          LLM
Access                        API only            Open weights
Input price (per 1M tokens)   $1.50               $0.60
Output price (per 1M tokens)  $9.00               $1.80
Speed                         221 tok/s           42 tok/s
Released                      May 19, 2026        April 7, 2025

Context window: 1,048,576 tokens
Parameters: 253,000,000,000
Input modalities: Text, Image, Audio, Video
Output modalities: Text
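
To make the price gap concrete, the listed per-1M-token prices can be turned into a per-request cost. The following is a minimal sketch, not an official pricing calculator: the `PRICES` table copies the figures above, while the `request_cost` helper and the workload size (10,000 input tokens, 1,000 output tokens) are assumptions chosen for illustration.

```python
# Prices in USD per 1M tokens, copied from the comparison above.
PRICES = {
    "Gemini 3.5 Flash": {"input": 1.50, "output": 9.00},
    "Llama 3.1 Nemotron Ultra 253B v1": {"input": 0.60, "output": 1.80},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-1M-token prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload (an assumption): 10,000 input and 1,000 output tokens.
for name in PRICES:
    print(f"{name}: ${request_cost(name, 10_000, 1_000):.4f}")
```

With those assumed token counts, a request costs about $0.0240 on Gemini 3.5 Flash and about $0.0078 on Llama 3.1 Nemotron Ultra 253B v1. The gap widens as the share of output tokens grows, because the output prices differ by a larger factor than the input prices.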

Shared benchmarks

                       Gemini 3.5 Flash    Llama 3.1 Nemotron Ultra 253B v1
GPQA Diamond           92.2                76
Humanity’s Last Exam   41                  8.1