Anthropic's latest Sonnet model: near-Opus quality at a fraction of the cost, with Agent Teams orchestration.

| Benchmark | Claude Sonnet 4.6 | Claude Sonnet 4 | Δ (percentage points) |
|---|---|---|---|
| SWE-bench | 80.8% | — | — |
| MMLU | 92.1% | 88.7% | +3.4 |
| HumanEval | 95.2% | 94.5% | +0.7 |
| SWE-bench Verified | 80.8% | 72.3% | +8.5 |
| GPQA Diamond | 78.4% | 74.0% | +4.4 |
| AIME 2025 | 88.5% | — | — |
| TAU-bench | 71.2% | — | — |