GPT-5.1 vs Hermes 4 - Llama-3.1 70B
OpenAI vs Nous Research — benchmarks, pricing, and capabilities side by side.
- GPT-5.1 has the higher intelligence index (89 vs 71.3)
- Hermes 4 - Llama-3.1 70B is cheaper ($0.10 vs $1.25 per 1M input tokens)
- GPT-5.1 is faster (115 vs 60 tok/s)
| | GPT-5.1 | Hermes 4 - Llama-3.1 70B |
|---|---|---|
| Intelligence index | 89 | 71.3 |
| Developer | OpenAI | Nous Research |
| Type | LLM | LLM |
| Access | API only | — |
| Context window | 400,000 tokens | — |
| Input price | $1.25 / 1M | $0.10 / 1M |
| Output price | $10.00 / 1M | $0.40 / 1M |
| Speed | 115 tok/s | 60 tok/s |
| Released | November 12, 2025 | August 27, 2025 |
| Parameters | — | — |
| Input modalities | Text, Image | — |
| Output modalities | Text | — |
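
Per-token prices are easier to compare on a concrete workload. The sketch below is a minimal illustration that uses only the input and output prices listed in the table. The `estimate_cost` helper and the example workload (2M input tokens, 500K output tokens) are assumptions made for illustration, and real bills may differ with caching, batching, or provider-specific pricing.

```python
# Prices in USD per 1M tokens, copied from the comparison table above.
PRICES = {
    "GPT-5.1": {"input": 1.25, "output": 10.00},
    "Hermes 4 - Llama-3.1 70B": {"input": 0.10, "output": 0.40},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a workload at the listed prices.

    Hypothetical helper: it ignores discounts, caching, and minimum charges.
    """
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Assumed example workload: 2M input tokens and 500K output tokens.
    for model in PRICES:
        cost = estimate_cost(model, 2_000_000, 500_000)
        print(f"{model}: ${cost:.2f}")
```

Because output tokens cost 8x the input price on GPT-5.1 and 4x on Hermes 4, the cost gap widens as a workload becomes more output-heavy.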
Shared benchmarks

| Benchmark | GPT-5.1 | Hermes 4 - Llama-3.1 70B |
|---|---|---|
| AIME 2025 | 94 | 68.7 |
| GPQA Diamond | 88.1 | 69.9 |
| Humanity’s Last Exam | 26.5 | 7.9 |
| LiveCodeBench | 86.8 | 65.3 |
| MMLU-Pro | 87 | 81.1 |
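
To see where the two models differ most, the shared benchmark scores above can be turned into point gaps. This is a small sketch under the assumption that the scores are directly comparable percentages. The `SCORES` and `gaps` names are illustrative and not part of any published tooling.

```python
# (GPT-5.1, Hermes 4 - Llama-3.1 70B) scores copied from the shared benchmarks.
SCORES = {
    "AIME 2025": (94, 68.7),
    "GPQA Diamond": (88.1, 69.9),
    "Humanity's Last Exam": (26.5, 7.9),
    "LiveCodeBench": (86.8, 65.3),
    "MMLU-Pro": (87, 81.1),
}

# Point gap per benchmark, rounded to one decimal to hide float noise.
gaps = {name: round(gpt - hermes, 1) for name, (gpt, hermes) in SCORES.items()}

if __name__ == "__main__":
    # List benchmarks from the largest GPT-5.1 lead to the smallest.
    for name, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name}: {gap:+.1f} points")
```

On these numbers the lead is widest on AIME 2025 and narrowest on MMLU-Pro.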