The frontier AI model tracker
Every major model release in one place — benchmarks, context windows, pricing, and the launch coverage that matters. Updated as models ship.
Top 10 on the leaderboard
All benchmarks →| # | Model | Company | Score |
|---|---|---|---|
| 1 | Gemini 3.1 Pro Preview | 94.1% | |
| 2 | GPT-5.5 | OpenAI | 93.5% |
| 3 | GPT-5.4 | OpenAI | 92.0% |
| 4 | GPT-5.3 Codex | OpenAI | 91.5% |
| 5 | Claude Opus 4.7 | Anthropic | 91.4% |
| 6 | Kimi K2.6 | Kimi | 91.1% |
| 7 | Grok 4.20 0309 v2 | xAI | 91.1% |
| 8 | GPT-5.2 | OpenAI | 90.3% |
| 9 | Grok 4.3 | xAI | 90.1% |
| 10 | DeepSeek V4 Flash | DeepSeek | 89.4% |
All tracked models
See full hub →| Model | Company | Released |
|---|---|---|
| MiniCPM-V 4.6 1.3B | OpenBMB | 2026-05-11 |
| Grok 4.3 | xAI | 2026-04-30 |
| Granite 4.1 30B | IBM | 2026-04-29 |
| Granite 4.1 3B | IBM | 2026-04-29 |
| Granite 4.1 8B | IBM | 2026-04-29 |
| Mistral Medium 3.5 | Mistral | 2026-04-29 |
| Nemotron 3 Nano Omni 30B A3B Reasoning | NVIDIA | 2026-04-29 |
| DeepSeek V4 Flash | DeepSeek | 2026-04-24 |
| DeepSeek V4 Pro | DeepSeek | 2026-04-24 |
| Ling-2.6-1T | InclusionAI | 2026-04-23 |
| GPT-5.5 | OpenAI | 2026-04-23 |
| Hy3-preview | Tencent | 2026-04-23 |
| Qwen3.6 27B | Alibaba | 2026-04-22 |
| MiMo-V2.5 | Xiaomi | 2026-04-22 |
| MiMo-V2.5-Pro | Xiaomi | 2026-04-22 |
| Ling 2.6 Flash | InclusionAI | 2026-04-21 |
| Qwen3.6 Max Preview | Alibaba | 2026-04-20 |
| Kimi K2.6 | Kimi | 2026-04-20 |
| Qwen3.6 35B A3B | Alibaba | 2026-04-16 |
| Claude Opus 4.7 | Anthropic | 2026-04-16 |
| EXAONE 4.5 33B | LG AI Research | 2026-04-09 |
| Muse Spark | Meta | 2026-04-08 |
| GLM-5.1 | Z AI | 2026-04-07 |
| Grok 4.20 0309 v2 | xAI | 2026-04-07 |
| Solar Pro 3 | Upstage | 2026-04-06 |
| Gemma 4 E4B | 2026-04-03 | |
| Qwen3.6 Plus | Alibaba | 2026-04-02 |
| Gemma 4 26B A4B | 2026-04-02 | |
| Gemma 4 31B | 2026-04-02 | |
| Gemma 4 E2B | 2026-04-02 | |
| Step 3.5 Flash 2603 | StepFun | 2026-04-02 |
| Trinity Large Thinking | Arcee AI | 2026-04-01 |
| GLM 5V Turbo | Z AI | 2026-04-01 |
| Qwen3.5 Omni Flash | Alibaba | 2026-03-30 |
| Qwen3.5 Omni Plus | Alibaba | 2026-03-30 |
| MiMo-V2-Omni-0327 | Xiaomi | 2026-03-27 |
| Nemotron Cascade 2 30B A3B | NVIDIA | 2026-03-19 |
| MiMo-V2-Omni | Xiaomi | 2026-03-19 |
| MiniMax-M2.7 | MiniMax | 2026-03-18 |
| MiMo-V2-Pro | Xiaomi | 2026-03-18 |
| GPT-5.4 mini | OpenAI | 2026-03-17 |
| GPT-5.4 nano | OpenAI | 2026-03-17 |
| Mistral Small 4 | Mistral | 2026-03-16 |
| NVIDIA Nemotron 3 Nano 4B | NVIDIA | 2026-03-16 |
| GLM-5-Turbo | Z AI | 2026-03-15 |
| NVIDIA Nemotron 3 Super 120B A12B | NVIDIA | 2026-03-11 |
| Grok 4.20 0309 | xAI | 2026-03-10 |
| Sarvam 105B | Sarvam | 2026-03-06 |
| Sarvam 30B | Sarvam | 2026-03-06 |
| GPT-5.4 | OpenAI | 2026-03-05 |
| Gemini 3.1 Flash-Lite Preview | 2026-03-03 | |
| Qwen3.5 0.8B | Alibaba | 2026-03-02 |
| Qwen3.5 2B | Alibaba | 2026-03-02 |
| Qwen3.5 4B | Alibaba | 2026-03-02 |
| Qwen3.5 9B | Alibaba | 2026-03-02 |
| LFM2 24B A2B | Liquid AI | 2026-02-25 |
| Qwen3.5 122B A10B | Alibaba | 2026-02-24 |
| Qwen3.5 27B | Alibaba | 2026-02-24 |
| Qwen3.5 35B A3B | Alibaba | 2026-02-24 |
| Mercury 2 | Inception | 2026-02-20 |
| Gemini 3.1 Pro Google's flagship reasoning model with a 2x jump on hard multi-step tasks. | 2026-02-19 | |
| Gemini 3.1 Pro Preview | 2026-02-19 | |
| Claude Sonnet 4.6 Near-Opus quality at a fraction of the cost, with Agent Teams orchestration. | Anthropic | 2026-02-17 |
| Tiny Aya Global | Cohere | 2026-02-17 |
| Qwen3.5 397B A17B | Alibaba | 2026-02-16 |
| DeepSeek V3.2 Open-weight MoE with a 1M+ token context window and strong coding. | DeepSeek | 2026-02-12 |
| MiniMax-M2.5 | MiniMax | 2026-02-12 |
| GLM-5 First frontier model trained entirely on Huawei Ascend silicon. | Zhipu AI | 2026-02-11 |
| Nanbeige4.1-3B | Nanbeige | 2026-02-11 |
| GLM-5 | Z AI | 2026-02-11 |
| GPT-5.3 Codex Coding-specialized variant of GPT-5.3, tuned for agentic IDE workflows. | OpenAI | 2026-02-05 |
| Claude Opus 4.6 | Anthropic | 2026-02-05 |
| Qwen3 Coder Next | Alibaba | 2026-02-03 |
| Step 3.5 Flash | StepFun | 2026-02-02 |
| LongCat Flash Lite | LongCat | 2026-01-28 |
| Kimi K2.5 | Kimi | 2026-01-27 |
| Qwen3 Max Thinking | Alibaba | 2026-01-26 |
| Kimi K2 Moonshot's open-weight frontier MoE with strong agentic benchmarks. | Moonshot AI | 2026-01-20 |
| LFM2.5-1.2B-Thinking | Liquid AI | 2026-01-20 |
| Step3 VL 10B | StepFun | 2026-01-20 |
| GLM-4.7-Flash | Z AI | 2026-01-19 |
| LFM2.5-1.2B-Instruct | Liquid AI | 2026-01-05 |
| GPT-5.2 Codex Prior-gen Codex variant of GPT-5.2 for agentic coding. | OpenAI | 2025-12-18 |
| Mistral Large 3 Mistral's flagship proprietary model, tuned for European enterprise. | Mistral | 2025-12-15 |
| GPT-5.2 Late-2025 GPT-5 refresh with improved reasoning and steerability. | OpenAI | 2025-12-11 |
| Claude Opus 4.5 Anthropic's top-tier reasoning model for complex research and agents. | Anthropic | 2025-11-24 |
| Gemini 3 Pro First Gemini 3 tier release; strong multimodal + long-context. | 2025-11-18 | |
| GPT-5.1 Maintenance update to GPT-5 with steerability + latency improvements. | OpenAI | 2025-11-12 |
| GPT-5 OpenAI's flagship unified reasoning + chat model replacing the GPT-4 line. | OpenAI | 2025-08-15 |
| Claude Opus 4.1 Mid-2025 Opus refresh focused on agentic coding reliability. | Anthropic | 2025-07-15 |
| Gemini 2.5 Flash Google's cost-optimized multimodal model with thinking mode. | 2025-06-20 | |
| Claude Sonnet 4 Claude 4 mid-tier with strong coding and long-horizon agentic reliability. | Anthropic | 2025-05-22 |
| Claude Sonnet 3.7 Extended-thinking update to Sonnet 3.5 with visible reasoning toggle. | Anthropic | 2025-02-24 |
| DeepSeek-V3 DeepSeek's breakthrough open-weight MoE rivaling GPT-4-class quality. | DeepSeek | 2024-12-26 |
| Gemini 2.0 Flash First Gemini 2 model — fast, cheap, multimodal, with tool use native. | 2024-12-11 | |
| Grok-2 Elon's second-gen Grok — real-time X/Twitter data access | xAI | 2024-08-13 |
| Claude 3.5 Sonnet The mid-2024 Sonnet release that set the SOTA bar for coding and agents. | Anthropic | 2024-06-20 |
| Claude 3 Opus Anthropic's most powerful pre-Claude 4 model — tops GPT-4 on reasoning | Anthropic | 2024-03-04 |
| Claude 3 Sonnet Balanced Claude 3 variant — best price/performance in the family | Anthropic | 2024-03-04 |
| Claude 3 Haiku Fastest and cheapest Claude 3 — sub-second latency at $0.25/M | Anthropic | 2024-03-04 |
| Mistral Large | Mistral | 2024-02-26 |
| Gemini 1.5 Pro Google's first 1M-context model — multimodal needle-in-haystack champion | 2024-02-15 | |
| text-embedding-3-large OpenAI's best embedding model — 3× cheaper than ada-002 with better MTEB | OpenAI | 2024-01-25 |
| GPT-4 Turbo GPT-4 with 128K context and knowledge through April 2023 — 3× cheaper than GPT-4 | OpenAI | 2024-01-25 |
| Grok-1 xAI's open-source release — 314B MoE, first frontier model fully open-sourced | xAI | 2023-12-07 |
| AlphaCode 2 DeepMind's coding specialist — top 15% of competitive programmers | Google DeepMind | 2023-12-06 |
| Gemini Ultra Google's first model to beat GPT-4 on MMLU — 90%+ with CoT | 2023-12-06 | |
| Gemini Pro Google's workhorse Gemini model — free tier in Google AI Studio | 2023-12-06 | |
| Claude 2.1 Claude 2 update — 200K context and reduced hallucinations | Anthropic | 2023-11-21 |
| Whisper v3 State-of-the-art open speech recognition — 99 languages, open weights | OpenAI | 2023-11-06 |
| Code Llama 34B Meta's code-specialized open model — top open-source coding at launch | Meta | 2023-08-24 |
| Llama 2 70B Meta's open-weight landmark — 70B that matched GPT-3.5 and ignited open AI | Meta | 2023-07-18 |
| Claude 2 Claude's first major leap — 100K context and better at instructions | Anthropic | 2023-07-11 |
| PaLM 2 Google's multilingual flagship — powers Bard 2023, 100+ languages | 2023-05-10 | |
| GPT-4 The model that changed everything — GPT-4 set the standard for capable AI | OpenAI | 2023-03-14 |
| Claude 1.3 Anthropic's first public model — 100K context ahead of its time | Anthropic | 2023-03-14 |
| ChatGPT (GPT-3.5 Turbo) The product that launched the AI era — 100M users in 2 months | OpenAI | 2022-11-30 |
| PaLM Google's 540B pathways model — first to demonstrate chain-of-thought at scale | 2022-04-04 | |
| Codex GPT-3 trained on code — the engine behind GitHub Copilot v1 | OpenAI | 2021-08-10 |
| GPT-3 The model that showed scaling works — 175B parameters, few-shot learning pioneer | OpenAI | 2020-06-11 |
Daily AI news, condensed
A curated daily roundup of frontier-AI news — model launches, research, deals, and product moves.
Read today's brief →