The model that changed everything: GPT-4 set the standard for capable AI
GPT-4 is OpenAI's most advanced system, producing safer and more useful responses.

| Benchmark | GPT-4 | ChatGPT (GPT-3.5 Turbo) | Δ (percentage points) |
|---|---|---|---|
| MMLU | 86.4% | 70.0% | +16.4 |
| HumanEval | 67.0% | 48.1% | +18.9 |
| MATH | 52.9% | 34.1% | +18.8 |
| Bar exam | ~90th percentile | — | — |