Gemini 3.1 Pro vs GLM-5: Benchmarks, Pricing & Capabilities Compared
TL;DR — Gemini 3.1 Pro wins for reasoning + long-context · GLM-5 wins for cost.
Gemini 3.1 Pro Google
- Released: 2026-02-19
- Context window: 2M tokens
- Input price: $2.50 / Mtok
- Output price: $10.00 / Mtok
Key features
- 2x reasoning score on ARC-AGI-2 vs Gemini 3 Pro
- ARC-AGI-2 score of 77.1%
- Enhanced multimodal understanding
GLM-5 Zhipu AI
- Released: 2026-02-11
- Context window: 200K tokens
- Input price: $0.11 / Mtok
- Output price: $0.28 / Mtok
Key features
- First frontier model trained on Huawei Ascend chips (no NVIDIA)
- #1 HLE score (50.4%)
- 1.2% hallucination rate via Slime RL
Benchmark comparison
MMLU-Pro is the only benchmark reported for both models: Gemini 3.1 Pro 93.8% vs GLM-5 88.7%.
Pricing comparison
| Metric | Gemini 3.1 Pro | GLM-5 |
|---|---|---|
| Input ($/Mtok) | $2.50 | $0.11 |
| Output ($/Mtok) | $10.00 | $0.28 |
| Cached input ($/Mtok) | $0.25 | — |
| Cost per 1M-token roundtrip (1M in + 1M out) | $12.50 | $0.39 |
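
The roundtrip row is simple arithmetic on the listed per-Mtok prices, and the same formula applies to any input/output mix. A minimal sketch, assuming uncached list prices as shown in the table; the `PRICES` dictionary and the `workload_cost` helper are illustrative names, not part of either vendor's API:

```python
# Prices in USD per million tokens, copied from the pricing table above.
# Cached-input pricing is ignored here (an assumption of this sketch).
PRICES = {
    "Gemini 3.1 Pro": {"input": 2.50, "output": 10.00},
    "GLM-5": {"input": 0.11, "output": 0.28},
}


def workload_cost(model, input_tokens, output_tokens):
    """Return the USD cost of a workload at the listed uncached prices."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000


if __name__ == "__main__":
    # Reproduce the roundtrip row: 1M tokens in + 1M tokens out.
    for name in PRICES:
        print(f"{name}: ${workload_cost(name, 1_000_000, 1_000_000):.2f}")
```

At these prices the 1M-in + 1M-out roundtrip costs about 32x more on Gemini 3.1 Pro than on GLM-5 (12.50 / 0.39 ≈ 32.05).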
Context window & modalities
| Attribute | Gemini 3.1 Pro | GLM-5 |
|---|---|---|
| Context window | 2M tokens | 200K tokens |
| Input modalities | text, image, audio, video, PDF | text, image |
| Output modalities | text | text |
| Knowledge cutoff | 2025-12 | 2025-11 |
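
To see what the context-window gap means in practice, here is a rough sketch of a fit check. It assumes "2M" and "200K" mean 2,000,000 and 200,000 tokens, and that reserved output tokens count against the same window; both are assumptions, and the `fits_context` helper is hypothetical:

```python
# Context windows from the table above, read as decimal token counts
# (assumption: 2M = 2,000,000 and 200K = 200,000).
CONTEXT_WINDOW = {
    "Gemini 3.1 Pro": 2_000_000,
    "GLM-5": 200_000,
}


def fits_context(model, prompt_tokens, reserved_output_tokens=0):
    """Return True if the prompt plus reserved output fits in the window.

    Assumes output tokens share the window with the prompt, which may
    not hold for every provider.
    """
    return prompt_tokens + reserved_output_tokens <= CONTEXT_WINDOW[model]


if __name__ == "__main__":
    # A 500K-token prompt with 8K tokens reserved for the reply.
    for name in CONTEXT_WINDOW:
        print(name, fits_context(name, 500_000, 8_000))
```

Under these assumptions a 500K-token prompt fits Gemini 3.1 Pro but would need chunking or retrieval to run on GLM-5.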
Verdict by use case
Coding
Insufficient data
Basis: SWE-bench
No shared coding benchmark.
Reasoning
→ Gemini 3.1 Pro
Basis: MMLU-Pro
Gemini 3.1 Pro 93.8% vs GLM-5 88.7% on MMLU-Pro.
Math
Insufficient data
Basis: MATH / AIME
No shared math benchmark.
Long context
→ Gemini 3.1 Pro
Basis: Context window
Gemini 3.1 Pro 2M tokens vs GLM-5 200K tokens.
Cost
→ GLM-5
Basis: Input $/Mtok
Gemini 3.1 Pro $2.50/Mtok vs GLM-5 $0.11/Mtok input.
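
The benchmark-based verdicts above appear to follow one rule: name a winner only when both models report the same benchmark, otherwise return "Insufficient data". The sketch below is a guess at that rule, not the page's actual method; the `verdict` function is hypothetical, and the scores are the ones quoted on this page:

```python
# Scores quoted on this page; a missing key means "not reported".
GEMINI = {"MMLU-Pro": 93.8, "ARC-AGI-2": 77.1}
GLM = {"MMLU-Pro": 88.7, "HLE": 50.4}


def verdict(benchmark, name_a, scores_a, name_b, scores_b):
    """Pick the higher scorer on a shared benchmark, if one exists."""
    a = scores_a.get(benchmark)
    b = scores_b.get(benchmark)
    if a is None or b is None:
        # At least one model did not report this benchmark.
        return "Insufficient data"
    if a == b:
        return "Tie"
    return name_a if a > b else name_b


if __name__ == "__main__":
    for bench in ("SWE-bench", "MMLU-Pro"):
        print(bench, "->", verdict(bench, "Gemini 3.1 Pro", GEMINI, "GLM-5", GLM))
```

The long-context and cost verdicts use a spec value (context window, input price) rather than a benchmark, so they fall outside this rule.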
Changelog & releases
Gemini 3.1 Pro
Released 2026-02-19
Predecessor: google-gemini-3-pro
- 2x reasoning score on ARC-AGI-2 vs Gemini 3 Pro
- Context window expanded to 2M tokens
- Deep Think mode enabled by default on the Pro tier
- Lower first-token latency despite the larger context window
GLM-5
Released 2026-02-11
- Trained entirely on Huawei Ascend 910B clusters (no NVIDIA)
- Slime RL fine-tuning drops hallucination rate to 1.2%
- 136x cheaper than Claude Opus 4.5 at comparable quality