AI Flash Report

The frontier AI model tracker

Every major model release in one place — benchmarks, context windows, pricing, and the launch coverage that matters. Updated as models ship.

Top 10 on the leaderboard

#   Model                    Company    Score
1   Gemini 3.1 Pro Preview   Google     94.1%
2   GPT-5.5                  OpenAI     93.5%
3   GPT-5.4                  OpenAI     92.0%
4   GPT-5.3 Codex            OpenAI     91.5%
5   Claude Opus 4.7          Anthropic  91.4%
6   Kimi K2.6                Kimi       91.1%
7   Grok 4.20 0309 v2        xAI        91.1%
8   GPT-5.2                  OpenAI     90.3%
9   Grok 4.3                 xAI        90.1%
10  DeepSeek V4 Flash        DeepSeek   89.4%

All tracked models

All 120 tracked models
Model Company Released
MiniCPM-V 4.6 1.3B OpenBMB 2026-05-11
Grok 4.3 xAI 2026-04-30
Granite 4.1 30B IBM 2026-04-29
Granite 4.1 3B IBM 2026-04-29
Granite 4.1 8B IBM 2026-04-29
Mistral Medium 3.5 Mistral 2026-04-29
Nemotron 3 Nano Omni 30B A3B Reasoning NVIDIA 2026-04-29
DeepSeek V4 Flash DeepSeek 2026-04-24
DeepSeek V4 Pro DeepSeek 2026-04-24
Ling-2.6-1T InclusionAI 2026-04-23
GPT-5.5 OpenAI 2026-04-23
Hy3-preview Tencent 2026-04-23
Qwen3.6 27B Alibaba 2026-04-22
MiMo-V2.5 Xiaomi 2026-04-22
MiMo-V2.5-Pro Xiaomi 2026-04-22
Ling 2.6 Flash InclusionAI 2026-04-21
Qwen3.6 Max Preview Alibaba 2026-04-20
Kimi K2.6 Kimi 2026-04-20
Qwen3.6 35B A3B Alibaba 2026-04-16
Claude Opus 4.7 Anthropic 2026-04-16
EXAONE 4.5 33B LG AI Research 2026-04-09
Muse Spark Meta 2026-04-08
GLM-5.1 Z AI 2026-04-07
Grok 4.20 0309 v2 xAI 2026-04-07
Solar Pro 3 Upstage 2026-04-06
Gemma 4 E4B Google 2026-04-03
Qwen3.6 Plus Alibaba 2026-04-02
Gemma 4 26B A4B Google 2026-04-02
Gemma 4 31B Google 2026-04-02
Gemma 4 E2B Google 2026-04-02
Step 3.5 Flash 2603 StepFun 2026-04-02
Trinity Large Thinking Arcee AI 2026-04-01
GLM 5V Turbo Z AI 2026-04-01
Qwen3.5 Omni Flash Alibaba 2026-03-30
Qwen3.5 Omni Plus Alibaba 2026-03-30
MiMo-V2-Omni-0327 Xiaomi 2026-03-27
Nemotron Cascade 2 30B A3B NVIDIA 2026-03-19
MiMo-V2-Omni Xiaomi 2026-03-19
MiniMax-M2.7 MiniMax 2026-03-18
MiMo-V2-Pro Xiaomi 2026-03-18
GPT-5.4 mini OpenAI 2026-03-17
GPT-5.4 nano OpenAI 2026-03-17
Mistral Small 4 Mistral 2026-03-16
NVIDIA Nemotron 3 Nano 4B NVIDIA 2026-03-16
GLM-5-Turbo Z AI 2026-03-15
NVIDIA Nemotron 3 Super 120B A12B NVIDIA 2026-03-11
Grok 4.20 0309 xAI 2026-03-10
Sarvam 105B Sarvam 2026-03-06
Sarvam 30B Sarvam 2026-03-06
GPT-5.4 OpenAI 2026-03-05
Gemini 3.1 Flash-Lite Preview Google 2026-03-03
Qwen3.5 0.8B Alibaba 2026-03-02
Qwen3.5 2B Alibaba 2026-03-02
Qwen3.5 4B Alibaba 2026-03-02
Qwen3.5 9B Alibaba 2026-03-02
LFM2 24B A2B Liquid AI 2026-02-25
Qwen3.5 122B A10B Alibaba 2026-02-24
Qwen3.5 27B Alibaba 2026-02-24
Qwen3.5 35B A3B Alibaba 2026-02-24
Mercury 2 Inception 2026-02-20
Gemini 3.1 Pro
Google's flagship reasoning model with a 2x jump on hard multi-step tasks.
Google 2026-02-19
Gemini 3.1 Pro Preview Google 2026-02-19
Claude Sonnet 4.6
Near-Opus quality at a fraction of the cost, with Agent Teams orchestration.
Anthropic 2026-02-17
Tiny Aya Global Cohere 2026-02-17
Qwen3.5 397B A17B Alibaba 2026-02-16
DeepSeek V3.2
Open-weight MoE with a 1M+ token context window and strong coding.
DeepSeek 2026-02-12
MiniMax-M2.5 MiniMax 2026-02-12
GLM-5
First frontier model trained entirely on Huawei Ascend silicon.
Zhipu AI 2026-02-11
Nanbeige4.1-3B Nanbeige 2026-02-11
GLM-5 Z AI 2026-02-11
GPT-5.3 Codex
Coding-specialized variant of GPT-5.3, tuned for agentic IDE workflows.
OpenAI 2026-02-05
Claude Opus 4.6 Anthropic 2026-02-05
Qwen3 Coder Next Alibaba 2026-02-03
Step 3.5 Flash StepFun 2026-02-02
LongCat Flash Lite LongCat 2026-01-28
Kimi K2.5 Kimi 2026-01-27
Qwen3 Max Thinking Alibaba 2026-01-26
Kimi K2
Moonshot's open-weight frontier MoE with strong agentic benchmarks.
Moonshot AI 2026-01-20
LFM2.5-1.2B-Thinking Liquid AI 2026-01-20
Step3 VL 10B StepFun 2026-01-20
GLM-4.7-Flash Z AI 2026-01-19
LFM2.5-1.2B-Instruct Liquid AI 2026-01-05
GPT-5.2 Codex
Prior-gen Codex variant of GPT-5.2 for agentic coding.
OpenAI 2025-12-18
Mistral Large 3
Mistral's flagship proprietary model, tuned for European enterprise.
Mistral 2025-12-15
GPT-5.2
Late-2025 GPT-5 refresh with improved reasoning and steerability.
OpenAI 2025-12-11
Claude Opus 4.5
Anthropic's top-tier reasoning model for complex research and agents.
Anthropic 2025-11-24
Gemini 3 Pro
First Gemini 3 tier release, with strong multimodal and long-context performance.
Google 2025-11-18
GPT-5.1
Maintenance update to GPT-5 with steerability and latency improvements.
OpenAI 2025-11-12
GPT-5
OpenAI's flagship unified reasoning and chat model, replacing the GPT-4 line.
OpenAI 2025-08-15
Claude Opus 4.1
Mid-2025 Opus refresh focused on agentic coding reliability.
Anthropic 2025-07-15
Gemini 2.5 Flash
Google's cost-optimized multimodal model with thinking mode.
Google 2025-06-20
Claude Sonnet 4
Claude 4 mid-tier with strong coding and long-horizon agentic reliability.
Anthropic 2025-05-22
Claude Sonnet 3.7
Extended-thinking update to Sonnet 3.5 with visible reasoning toggle.
Anthropic 2025-02-24
DeepSeek-V3
DeepSeek's breakthrough open-weight MoE rivaling GPT-4-class quality.
DeepSeek 2024-12-26
Gemini 2.0 Flash
First Gemini 2 model: fast, cheap, and multimodal, with native tool use.
Google 2024-12-11
Grok-2
xAI's second-generation Grok, with real-time access to X (formerly Twitter) data
xAI 2024-08-13
Claude 3.5 Sonnet
The mid-2024 Sonnet release that set the SOTA bar for coding and agents.
Anthropic 2024-06-20
Claude 3 Opus
Anthropic's most powerful pre-Claude 4 model — tops GPT-4 on reasoning
Anthropic 2024-03-04
Claude 3 Sonnet
Balanced Claude 3 variant — best price/performance in the family
Anthropic 2024-03-04
Claude 3 Haiku
Fastest and cheapest Claude 3, with sub-second latency at $0.25 per million input tokens
Anthropic 2024-03-04
Mistral Large Mistral 2024-02-26
Gemini 1.5 Pro
Google's first 1M-context model — multimodal needle-in-haystack champion
Google 2024-02-15
text-embedding-3-large
OpenAI's largest embedding model, with higher MTEB scores than ada-002
OpenAI 2024-01-25
GPT-4 Turbo
GPT-4 with 128K context and knowledge through April 2023 — 3× cheaper than GPT-4
OpenAI 2024-01-25
Grok-1
xAI's open-source release — 314B MoE, first frontier model fully open-sourced
xAI 2023-12-07
AlphaCode 2
DeepMind's coding specialist — top 15% of competitive programmers
Google DeepMind 2023-12-06
Gemini Ultra
Google's first model to beat GPT-4 on MMLU — 90%+ with CoT
Google 2023-12-06
Gemini Pro
Google's workhorse Gemini model — free tier in Google AI Studio
Google 2023-12-06
Claude 2.1
Claude 2 update — 200K context and reduced hallucinations
Anthropic 2023-11-21
Whisper v3
State-of-the-art open speech recognition — 99 languages, open weights
OpenAI 2023-11-06
Code Llama 34B
Meta's code-specialized open model — top open-source coding at launch
Meta 2023-08-24
Llama 2 70B
Meta's open-weight landmark: a 70B model that matched GPT-3.5 and galvanized open model development
Meta 2023-07-18
Claude 2
Claude's first major leap, with 100K context and better instruction following
Anthropic 2023-07-11
PaLM 2
Google's multilingual flagship: powered Bard in 2023, 100+ languages
Google 2023-05-10
GPT-4
The model that changed everything — GPT-4 set the standard for capable AI
OpenAI 2023-03-14
Claude 1.3
Anthropic's first public model — 100K context ahead of its time
Anthropic 2023-03-14
ChatGPT (GPT-3.5 Turbo)
The product that launched the AI era — 100M users in 2 months
OpenAI 2022-11-30
PaLM
Google's 540B Pathways model, first to demonstrate chain-of-thought at scale
Google 2022-04-04
Codex
GPT-3 trained on code — the engine behind GitHub Copilot v1
OpenAI 2021-08-10
GPT-3
The model that showed scaling works — 175B parameters, few-shot learning pioneer
OpenAI 2020-06-11

Daily AI news, condensed

A curated daily roundup of frontier-AI news — model launches, research, deals, and product moves.
