Gemini 3.5 Flash vs Llama 3.1 Nemotron Ultra 253B v1

Google vs NVIDIA — benchmarks, pricing, and capabilities side by side.

• Gemini 3.5 Flash has the higher intelligence index (92.2 vs 78.3)
• Llama 3.1 Nemotron Ultra 253B v1 is cheaper on both input ($0.60 vs $1.50 per 1M tokens) and output ($1.80 vs $9.00 per 1M tokens)
• Gemini 3.5 Flash is faster (221 vs 42 tok/s)

	Gemini 3.5 Flash	Llama 3.1 Nemotron Ultra 253B v1
Intelligence index	92.2	78.3
Developer	Google	NVIDIA
Type	Multimodal	LLM
Access	API only	Open weights
Context window	1,048,576 tokens	—
Input price	$1.50 / 1M	$0.60 / 1M
Output price	$9.00 / 1M	$1.80 / 1M
Speed	221 tok/s	42 tok/s
Released	May 19, 2026	April 7, 2025
Parameters	—	253B
Input modalities	Text, Image, Audio, Video	—
Output modalities	Text	—
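
To make the per-token prices above concrete, here is a minimal sketch that computes the cost of one workload from the listed input and output prices. The workload size (2M input tokens, 500K output tokens) and the names `PRICES` and `workload_cost` are hypothetical and chosen only for illustration. Real bills may differ, since caching, batch discounts, and hosting-provider pricing are not modeled.

```python
# Listed prices in USD per 1M tokens, copied from the comparison table.
PRICES = {
    "Gemini 3.5 Flash": {"input": 1.50, "output": 9.00},
    "Llama 3.1 Nemotron Ultra 253B v1": {"input": 0.60, "output": 1.80},
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload at the listed per-1M-token prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500K output tokens.
    for model in PRICES:
        cost = workload_cost(model, 2_000_000, 500_000)
        print(f"{model}: ${cost:.2f}")
    # Gemini 3.5 Flash: $7.50
    # Llama 3.1 Nemotron Ultra 253B v1: $2.10
```

Because the output price gap (5x) is wider than the input price gap (2.5x), the cost difference grows as a workload becomes more output-heavy.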

Shared benchmarks

	Gemini 3.5 Flash	Llama 3.1 Nemotron Ultra 253B v1
GPQA Diamond	92.2	76
Humanity’s Last Exam	41	8.1
