OpenAI's next-generation flagship model: a unified reasoning and chat model with adaptive reasoning capabilities, replacing the GPT-4 line.

| Benchmark | GPT-5 |
|---|---|
| MMLU | 91.0% |
| HumanEval | 93.5% |
| MATH | 90.1% |
| MMLU-Pro | 87.5% |
| GPQA Diamond | 74.2% |
| SWE-bench Verified | 67.4% |