Grok-4 Heavy vs Qwen3 VL 235B A22B Instruct

xAI vs Alibaba — benchmarks, pricing, and capabilities side by side.

• Grok-4 Heavy has the higher intelligence index (89.3 vs 70.9)

	Grok-4 Heavy	Qwen3 VL 235B A22B Instruct
Intelligence index	89.3	70.9
Developer	xAI	Alibaba
Type	Multimodal	Multimodal
Access	API only	Open weights
Context window	—	262,144 tokens
Input price	—	$0.20 / 1M
Output price	—	$0.88 / 1M
Speed	—	51 tok/s
Released	July 9, 2025	September 23, 2025
Parameters	—	—
Input modalities	—	Text, Image
Output modalities	—	Text
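
The listed prices and speed for Qwen3 VL 235B A22B Instruct can be turned into a rough per-request estimate. The sketch below is illustrative only: the function names are hypothetical, it assumes the prices are in USD per 1M tokens as shown in the table, and it assumes the 51 tok/s figure applies to output tokens and ignores any startup latency. No pricing or speed is listed for Grok-4 Heavy, so it is not covered here.

```python
# Figures copied from the comparison table (Qwen3 VL 235B A22B Instruct only).
INPUT_PRICE_PER_M = 0.20    # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.88   # USD per 1M output tokens
OUTPUT_SPEED_TOK_S = 51     # tokens per second (assumed to mean output throughput)


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for a single request."""
    return (
        input_tokens / 1_000_000 * INPUT_PRICE_PER_M
        + output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    )


def generation_seconds(output_tokens: int) -> float:
    """Estimated time to stream the output, ignoring time to first token."""
    return output_tokens / OUTPUT_SPEED_TOK_S


if __name__ == "__main__":
    cost = request_cost(input_tokens=10_000, output_tokens=2_000)
    secs = generation_seconds(2_000)
    print(f"cost: ${cost:.5f}, generation time: {secs:.1f} s")
```

Under these assumptions, a request with 10,000 input tokens and 2,000 output tokens costs well under one cent, with the output price dominating per token.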

Shared benchmarks

	Grok-4 Heavy	Qwen3 VL 235B A22B Instruct
AIME 2025	100	70.7
GPQA Diamond	88.4	71.2
Humanity’s Last Exam	50.7	6.3
LiveCodeBench	79.4	59.4
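
To see where the two models differ most, the shared benchmark scores can be compared directly. This is a minimal sketch using only the scores listed above; the `SCORES` dictionary and `score_gaps` helper are hypothetical names, and the gaps are raw differences in the reported scores, with no claim about statistical significance.

```python
# Shared benchmark scores: (Grok-4 Heavy, Qwen3 VL 235B A22B Instruct).
SCORES = {
    "AIME 2025": (100.0, 70.7),
    "GPQA Diamond": (88.4, 71.2),
    "Humanity's Last Exam": (50.7, 6.3),
    "LiveCodeBench": (79.4, 59.4),
}


def score_gaps(scores: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Return Grok-4 Heavy minus Qwen3 VL for each benchmark, rounded to 1 decimal."""
    return {name: round(grok - qwen, 1) for name, (grok, qwen) in scores.items()}


if __name__ == "__main__":
    gaps = score_gaps(SCORES)
    # Sort from largest to smallest gap.
    for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
        print(f"{name}: {gap:+.1f}")
```

The largest gap is on Humanity’s Last Exam and the smallest on GPQA Diamond.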
