GPT-5.5 vs Grok 4.3: Benchmarks, Pricing & Capabilities Compared
TL;DR: GPT-5.5 wins on reasoning; Grok 4.3 wins on cost and long context.
GPT-5.5 (OpenAI)
- Released: 2026-04-23
- Context window: 922K tokens
- Input price: $5.00 / Mtok
- Output price: $30.00 / Mtok
Grok 4.3 (xAI)
- Released: 2026-04-30
- Context window: 1M tokens
- Input price: $1.25 / Mtok
- Output price: $2.50 / Mtok
Benchmark comparison
| Benchmark | GPT-5.5 | Grok 4.3 |
|---|---|---|
| AA Intelligence Index | 60.2 ✓ | 53.2 |
| Chatbot Arena Elo | 1475 ✓ | 1455 |
| GPQA Diamond | 93.5% ✓ | 90.1% |
| HLE | 44.3% ✓ | 35.0% |
| IF-Bench | 75.9% | 81.3% ✓ |
| LiveCodeBench Reasoning | 74.3% ✓ | 64.3% |
| SciCode | 56.1% ✓ | 47.3% |
| TAU2-bench | 93.9% | 97.7% ✓ |
| TerminalBench-Hard | 60.6% ✓ | 37.9% |
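The checkmarks in the table can be tallied mechanically. The sketch below is only an illustration: it copies the scores from the table, assumes higher is better on every row, and uses a hypothetical helper name (`tally_wins`) that does not come from the source.

```python
# Scores copied from the benchmark table: (GPT-5.5, Grok 4.3).
# Assumption: higher is better for every benchmark listed.
SCORES = {
    "AA Intelligence Index": (60.2, 53.2),
    "Chatbot Arena Elo": (1475, 1455),
    "GPQA Diamond": (93.5, 90.1),
    "HLE": (44.3, 35.0),
    "IF-Bench": (75.9, 81.3),
    "LiveCodeBench Reasoning": (74.3, 64.3),
    "SciCode": (56.1, 47.3),
    "TAU2-bench": (93.9, 97.7),
    "TerminalBench-Hard": (60.6, 37.9),
}


def tally_wins(scores):
    """Count how many benchmarks each model leads (hypothetical helper)."""
    wins = {"GPT-5.5": 0, "Grok 4.3": 0, "tie": 0}
    for gpt_score, grok_score in scores.values():
        if gpt_score > grok_score:
            wins["GPT-5.5"] += 1
        elif grok_score > gpt_score:
            wins["Grok 4.3"] += 1
        else:
            wins["tie"] += 1
    return wins


wins = tally_wins(SCORES)
print(wins)  # {'GPT-5.5': 7, 'Grok 4.3': 2, 'tie': 0}
```

A raw win count treats every benchmark as equally important, so it is a summary of the table rather than a ranking of the models.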
Pricing comparison
| Metric | GPT-5.5 | Grok 4.3 |
|---|---|---|
| Input ($/Mtok) | $5.00 | $1.25 |
| Output ($/Mtok) | $30.00 | $2.50 |
| Cached input ($/Mtok) | — | — |
| Cost for 1M input + 1M output tokens | $35.00 | $3.75 |
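The last row is simply input price plus output price for one million tokens each. For other workloads the same arithmetic applies; the following is a minimal sketch using the list prices from the table, with a hypothetical function name (`request_cost`) and a made-up example workload. It ignores cached-input pricing, which the table does not provide.

```python
# List prices from the pricing table, in USD per million tokens: (input, output).
PRICES = {
    "GPT-5.5": (5.00, 30.00),
    "Grok 4.3": (1.25, 2.50),
}


def request_cost(model, input_tokens, output_tokens):
    """Return the USD cost of one request at list price (hypothetical helper)."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Reproduces the table row: 1M input tokens plus 1M output tokens.
print(request_cost("GPT-5.5", 1_000_000, 1_000_000))   # 35.0
print(request_cost("Grok 4.3", 1_000_000, 1_000_000))  # 3.75

# Hypothetical workload: 20,000 input tokens and 1,000 output tokens per request.
for model in PRICES:
    print(model, round(request_cost(model, 20_000, 1_000), 4))
```

Because the output prices differ by a larger factor than the input prices (12x versus 4x), the cost gap widens as the share of output tokens grows.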
Context window & modalities
| Attribute | GPT-5.5 | Grok 4.3 |
|---|---|---|
| Context window | 922K tokens | 1M tokens |
| Input modalities | text, image | text, image |
| Output modalities | text | text |
| Knowledge cutoff | — | — |
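Whether the difference in context window matters depends on prompt size. The sketch below rests on two assumptions not confirmed by the source: that "922K" and "1M" mean 922,000 and 1,000,000 tokens, and that reserved output tokens count against the same window. The helper name and the example prompt size are hypothetical.

```python
# Assumption: "922K" and "1M" are decimal token counts; the source does not say.
CONTEXT_WINDOW = {
    "GPT-5.5": 922_000,
    "Grok 4.3": 1_000_000,
}


def fits_in_context(model, prompt_tokens, reserved_output_tokens=0):
    """Return True if prompt plus reserved output fits the window.

    Assumption: output tokens share the context window with the prompt.
    """
    return prompt_tokens + reserved_output_tokens <= CONTEXT_WINDOW[model]


prompt_tokens = 950_000  # hypothetical prompt size
for model in CONTEXT_WINDOW:
    print(model, fits_in_context(model, prompt_tokens, reserved_output_tokens=8_000))
# GPT-5.5 False
# Grok 4.3 True
```

Under these assumptions the two windows differ by 78,000 tokens, so the distinction only matters for prompts near the upper limit.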
Verdict by use case
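| Use case | Verdict | Basis | Details |
|---|---|---|---|
| Coding | Insufficient data | SWE-bench | No shared SWE-bench result. The benchmark comparison does list coding-related scores (LiveCodeBench Reasoning, SciCode, TerminalBench-Hard), and GPT-5.5 leads on all three. |
| Reasoning | GPT-5.5 | GPQA Diamond | GPT-5.5 93.5% vs Grok 4.3 90.1% on GPQA Diamond. |
| Math | Insufficient data | MATH / AIME | No shared MATH or AIME result. |
| Long context | Grok 4.3 | Context window | GPT-5.5 922K tokens vs Grok 4.3 1M tokens. |
| Cost | Grok 4.3 | Input $/Mtok | GPT-5.5 $5.00/Mtok vs Grok 4.3 $1.25/Mtok input. |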