Gemini 3.5 Flash vs UI-TARS 7B
Google vs ByteDance — benchmarks, pricing, and capabilities side by side.
- UI-TARS 7B is cheaper ($0.10 vs $1.50 per 1M input tokens)
- Gemini 3.5 Flash has a larger context window (1M vs 128K tokens)
| | Gemini 3.5 Flash | UI-TARS 7B |
|---|---|---|
| Intelligence index | 92.2 | — |
| Developer | Google | ByteDance |
| Type | Multimodal | Multimodal |
| Access | API only | Open weights |
| Context window | 1,048,576 tokens | 128,000 tokens |
| Input price | $1.50 / 1M | $0.10 / 1M |
| Output price | $9.00 / 1M | $0.20 / 1M |
| Speed | 221 tok/s | — |
| Released | May 19, 2026 | July 22, 2025 |
| Parameters | — | — |
| Input modalities | Text, Image, Audio, Video | Image, Text |
| Output modalities | Text | Text |
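
To see what the listed prices mean for a concrete job, the per-1M-token rates in the table can be turned into a rough cost estimate. The sketch below is illustrative only: the `PRICES` dictionary, the `workload_cost` helper, and the example workload of 2M input and 500K output tokens are assumptions made for this example, and the rates are taken from the table at face value.

```python
# Prices in USD per 1M tokens, copied from the comparison table.
PRICES = {
    "Gemini 3.5 Flash": {"input": 1.50, "output": 9.00},
    "UI-TARS 7B": {"input": 0.10, "output": 0.20},
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a workload at the listed rates."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000  # rates are quoted per 1M tokens


# Hypothetical workload: 2M input tokens and 500K output tokens.
for name in PRICES:
    print(f"{name}: ${workload_cost(name, 2_000_000, 500_000):.2f}")
```

For this assumed workload the estimate comes to about $7.50 for Gemini 3.5 Flash and $0.30 for UI-TARS 7B. The gap widens as the share of output tokens grows, since the listed output prices differ by a larger factor than the input prices.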