🚀 AI Model Release Timeline

Comprehensive tracker of major AI model releases with specifications and performance metrics

59 model releases tracked | Last updated: 2026-02-23 12:04 UTC

2026

February

Major Release

Gemini 3.1 Pro

Google
Released: 2026-02-19
Type: LLM
Size: ~1T MoE
Google's latest flagship model with a major 2x jump in reasoning capabilities

✨ Key Features

  • 2x reasoning improvement
  • ARC-AGI-2 score of 77.1%
  • Enhanced multimodal understanding
  • Deep Think mode

📊 Performance Metrics

  • ARC-AGI-2: 77.1%
  • MMLU: 93.8%
  • MATH: 89.4%

Major Release

Claude Sonnet 4.6

Anthropic
Released: 2026-02-17
Type: LLM
Size: ~500B
Anthropic's latest Sonnet with Agent Teams capability and near-Opus performance at a fraction of the cost

✨ Key Features

  • Agent Teams: orchestrate 2-16 Claude instances
  • Near-Opus performance at 1/5th cost
  • 80.8% SWE-bench Verified
  • Fast mode research preview

📊 Performance Metrics

  • SWE-bench: 80.8%
  • MMLU: 92.1%
  • HumanEval: 95.2%

Update

DeepSeek V3.2

DeepSeek
Released: 2026-02-12
Type: LLM
Size: 671B MoE
Major update with 10x context window expansion to over 1 million tokens

✨ Key Features

  • 1M+ token context window (10x expansion)
  • Improved reasoning capabilities
  • Open source release
  • Cost-effective inference

📊 Performance Metrics

  • MMLU: 90.1%
  • HumanEval: 92.5%
  • Context Window: 1M+ tokens

Major Release

GLM-5

Zhipu AI
Released: 2026-02-11
Type: LLM
Size: 744B
First frontier AI model trained entirely without NVIDIA GPUs, using Huawei Ascend chips

✨ Key Features

  • First frontier model trained on Huawei Ascend chips (no NVIDIA)
  • #1 HLE score (50.4%)
  • 1.2% hallucination rate via Slime RL
  • 136x cheaper than Claude Opus 4.5
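
The "136x cheaper" claim can be combined with the $0.11/M token cost listed for this model to back out the reference price it implies. The tracker does not say whether the comparison uses input, output, or blended pricing, so the sketch below only computes the implied figure; the function name `implied_reference_price` and the ratio interpretation are assumptions made for illustration.

```python
def implied_reference_price(price_per_m: float, times_cheaper: float) -> float:
    """Back-compute the reference price implied by an 'N x cheaper' claim."""
    # Assumption: "N x cheaper" means reference_price / price == N.
    return price_per_m * times_cheaper


glm5_price = 0.11    # USD per million tokens, as listed for GLM-5
claimed_ratio = 136  # "136x cheaper than Claude Opus 4.5"

reference = implied_reference_price(glm5_price, claimed_ratio)
print(f"Implied reference price: ${reference:.2f} per million tokens")
```

Under that reading the claim corresponds to a reference price of roughly $14.96 per million tokens; a different pricing basis would change the ratio.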

📊 Performance Metrics

  • HLE: 50.4%
  • Hallucination Rate: 1.2%
  • Cost: $0.11/M tokens

Major Release

Seedance 2.0

ByteDance
Released: 2026-02-10
Type: Video
Size: ~10B
First commercial video model generating synchronized audio and video in a single pass

✨ Key Features

  • Synchronized audio + video generation in one pass
  • Lip-sync in 8+ languages
  • 10-30x cheaper than Sora 2
  • Commercial pricing: $0.10-$0.80/min

📊 Performance Metrics

  • Audio Sync: Excellent
  • Languages: 8+
  • Cost: $0.10-$0.80/min

Major Release

GPT-5.3 Codex

OpenAI
Released: 2026-02-05
Type: Code
Size: ~200B
OpenAI's specialized self-improving coding model with state-of-the-art software engineering performance

✨ Key Features

  • Self-improving agentic coding
  • 25% faster than GPT-5.2-Codex
  • 1,000+ tokens/sec generation
  • First OpenAI model flagged 'high' on cybersecurity framework

📊 Performance Metrics

  • Terminal-Bench: 77.3%
  • SWE-Bench Pro: SOTA
  • Speed: 1,000+ tok/s
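
"25% faster" can be read two ways: 1.25x the predecessor's throughput, or 25% less wall-clock time for the same output. The sketch below works through both readings, treating the listed 1,000 tok/s lower bound as the actual rate (an assumption) and using a hypothetical 50,000-token job.

```python
def generation_seconds(tokens: int, tokens_per_sec: float) -> float:
    """Time to emit `tokens` at a constant generation rate."""
    return tokens / tokens_per_sec


new_rate = 1000.0  # tok/s, the listed lower bound taken as the rate
tokens = 50_000    # hypothetical job size

# Reading 1: throughput is 1.25x the predecessor's.
old_rate_throughput = new_rate / 1.25
# Reading 2: wall-clock time is 25% lower, so new_time = 0.75 * old_time,
# which means the predecessor's rate was 0.75x the new rate.
old_rate_latency = new_rate * 0.75

for label, rate in [
    ("new model", new_rate),
    ("predecessor (throughput reading)", old_rate_throughput),
    ("predecessor (latency reading)", old_rate_latency),
]:
    print(f"{label}: {rate:.0f} tok/s, {generation_seconds(tokens, rate):.1f} s")
```

The two readings put the predecessor at 800 tok/s or 750 tok/s, so the claim is worth pinning down before comparing the two Codex entries.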

January

Major Release

Kimi K2

Moonshot AI
Released: 2026-01-20
Type: LLM
Size: 1.04T MoE
First open-weight model to rank #1 on LMSYS Chatbot Arena with over 1 trillion parameters

✨ Key Features

  • First open-weight model #1 on LMSYS Chatbot Arena
  • 1.04 trillion parameters
  • K2.5 agent swarms with up to 100 sub-agents
  • $0.15/M input tokens

📊 Performance Metrics

  • LMSYS Arena: #1
  • Parameters: 1.04T
  • Cost: $0.15/M tokens

2025

December

Update

GPT-5.2 Codex

OpenAI
Released: 2025-12-18
Type: Code
Size: ~200B
Specialized coding variant of GPT-5.2 focused on software engineering tasks

✨ Key Features

  • Specialized for software engineering
  • Enhanced agentic coding
  • Multi-file refactoring
  • Advanced debugging capabilities

📊 Performance Metrics

  • SWE-Bench: SOTA
  • HumanEval: 96.1%
  • Terminal-Bench: 72.8%

Major Release

Mistral Large 3

Mistral
Released: 2025-12-15
Type: LLM
Size: ~123B
Mistral's flagship model competing with GPT-5 class models at a fraction of the cost

✨ Key Features

  • 128K context window
  • Improved multilingual capabilities
  • Enhanced function calling
  • Competitive with GPT-5 class models

📊 Performance Metrics

  • MMLU: 88.2%
  • HumanEval: 82.5%
  • MATH: 72.1%

Update

GPT-5.2

OpenAI
Released: 2025-12-11
Type: LLM
Size: ~200B
Iterative improvement on GPT-5.1 with enhanced reasoning and faster performance

✨ Key Features

  • Enhanced reasoning capabilities
  • Improved adaptive reasoning
  • Better multimodal understanding
  • Faster inference

📊 Performance Metrics

  • MMLU: 92.8%
  • MATH: 88.5%
  • HumanEval: 95.8%

November

Major Release

Claude Opus 4.5

Anthropic
Released: 2025-11-24
Type: LLM
Size: ~500B
Anthropic's most capable model with breakthrough coding performance and major price reduction

✨ Key Features

  • First model to exceed 80% on SWE-Bench Verified (80.9%)
  • 67% price reduction vs previous Opus
  • Extended reasoning capabilities
  • Advanced coding performance

📊 Performance Metrics

  • SWE-bench: 80.9%
  • MMLU: 91.5%
  • HumanEval: 95.0%

Major Release

Gemini 3 Pro

Google
Released: 2025-11-18
Type: Multimodal
Size: ~1T MoE
Google's flagship model with Deep Think mode, ranked #1 on LMSYS Arena at launch

✨ Key Features

  • 1M token context window
  • Deep Think reasoning mode
  • Solved 5/6 IMO 2025 problems
  • #1 on LMSYS Arena

📊 Performance Metrics

  • ARC-AGI: 87.5%
  • MMLU: 93.2%
  • LMSYS Arena: #1

Major Release

GPT-5.1

OpenAI
Released: 2025-11-12
Type: LLM
Size: ~200B
Major GPT-5 iteration with adaptive reasoning and perfect scores on math competitions

✨ Key Features

  • Adaptive reasoning modes
  • Perfect 100% on AIME 2025
  • 87.5% on ARC-AGI
  • Enhanced multimodal capabilities

📊 Performance Metrics

  • ARC-AGI: 87.5%
  • AIME 2025: 100%
  • MMLU: 92.5%

August

Major Release

GPT-5

OpenAI
Released: 2025-08-07
Type: LLM
Size: ~200B
OpenAI's next-generation flagship model with adaptive reasoning capabilities

✨ Key Features

  • Adaptive reasoning (routes between quick and deep thinking)
  • Improved math and coding
  • Enhanced multimodal reasoning
  • New safety architecture

📊 Performance Metrics

  • MMLU: 91.0%
  • HumanEval: 93.5%
  • MATH: 85.0%

July

Update

Claude Opus 4.1

Anthropic
Released: 2025-07-15
Type: LLM
Size: ~500B
Iterative improvement on Claude Opus 4 with enhanced multi-file refactoring

✨ Key Features

  • Improved multi-file refactoring
  • Enhanced agentic capabilities
  • Better long-context performance
  • Reduced hallucinations

📊 Performance Metrics

  • SWE-bench: 75.2%
  • MMLU: 90.8%
  • HumanEval: 94.0%

June

Update

Gemini 2.5 Flash

Google
Released: 2025-06-20
Type: Multimodal
Size: ~175B
Google's fast and cost-effective model with enhanced image capabilities

✨ Key Features

  • Enhanced image editing stabilization
  • Faster inference
  • Improved multimodal understanding
  • Cost-effective deployment

📊 Performance Metrics

  • MMLU: 87.5%
  • Speed: 2x Gemini 2.0 Flash
  • Image Quality: High

May

Major Release

Claude Sonnet 4

Anthropic
Released: 2025-05-22
Type: LLM
Size: ~500B
Latest generation Claude model with significant performance improvements

✨ Key Features

  • Enhanced reasoning capabilities
  • Improved safety measures
  • Advanced multimodal understanding
  • Extended context window

📊 Performance Metrics

  • MMLU: 91.2%
  • HumanEval: 94.5%
  • MATH: 76.8%

February

Update

Claude 3.7 Sonnet

Anthropic
Released: 2025-02-24
Type: LLM
Size: ~300B
Iterative improvement on Claude 3.5 Sonnet with enhanced capabilities

✨ Key Features

  • Improved reasoning
  • Better code generation
  • Enhanced safety
  • Reduced hallucinations

📊 Performance Metrics

  • MMLU: 89.5%
  • HumanEval: 93.2%
  • MATH: 74.1%

2024

December

Major Release

DeepSeek-V3

DeepSeek
Released: 2024-12-26
Type: LLM
Size: 671B
DeepSeek's most advanced open-source model with MoE architecture

✨ Key Features

  • Mixture of Experts architecture
  • Cost-effective training
  • Open source release
  • Strong reasoning capabilities

📊 Performance Metrics

  • MMLU: 88.5%
  • HumanEval: 90.2%
  • MATH: 76.1%

Major Release

Gemini 2.0 Flash

Google
Released: 2024-12-11
Type: Multimodal
Size: ~175B
Google's next-generation model with native multimodal capabilities

✨ Key Features

  • Native multimodal generation
  • Real-time API
  • Agentic capabilities
  • Enhanced speed

📊 Performance Metrics

  • MMLU: 85.8%
  • HumanEval: 71.9%
  • MATH: 58.8%

October

Major Release

Moonshot Kimi

Moonshot AI
Released: 2024-10-16
Type: LLM
Size: ~100B
Chinese AI company's large language model optimized for Asian languages

✨ Key Features

  • 200K context window
  • Multilingual support
  • Document processing
  • Chinese language optimization

📊 Performance Metrics

  • C-Eval: 77.9%
  • CMMLU: 75.2%
  • HumanEval: 65.8%

September

Major Release

o1-preview

OpenAI
Released: 2024-09-12
Type: LLM
Size: ~175B
OpenAI's most advanced reasoning model designed for complex problem-solving

✨ Key Features

  • Advanced reasoning capabilities
  • Chain-of-thought processing
  • Enhanced problem-solving
  • Improved mathematical reasoning

📊 Performance Metrics

  • AIME: 83rd percentile
  • GPQA: 78%
  • MATH: 83%

August

Major Release

Grok-2

xAI
Released: 2024-08-13
Type: LLM
Size: ~314B
xAI's flagship model with real-time web access and multimodal capabilities

✨ Key Features

  • Real-time information access
  • Multimodal understanding
  • X platform integration
  • Conversational AI

📊 Performance Metrics

  • MMLU: 84.0%
  • HumanEval: 74.1%
  • MATH: 56.0%
Platform Exclusive

July

Major Release

GitHub Copilot Chat

Microsoft
Released: 2024-07-25
Type: Code
Size: ~70B
AI-powered coding assistant with conversational interface

✨ Key Features

  • Conversational coding
  • IDE integration
  • Code explanation
  • Multi-language support

📊 Performance Metrics

  • HumanEval: 84.2%
  • MBPP: 78.9%
  • CodeT5: 71.4%

June

Major Release

Claude 3.5 Sonnet

Anthropic
Released: 2024-06-20
Type: LLM
Size: ~175B
Anthropic's most intelligent model with significantly improved capabilities

✨ Key Features

  • 200K context window
  • Improved coding capabilities
  • Enhanced reasoning
  • Vision capabilities

📊 Performance Metrics

  • MMLU: 88.7%
  • HumanEval: 92%
  • MATH: 71.1%

March

Update

Grok-1.5

xAI
Released: 2024-03-28
Type: LLM
Size: ~314B
Enhanced version of Grok with improved performance across benchmarks

✨ Key Features

  • Improved reasoning
  • Code generation
  • 128K context window
  • Real-time data access

📊 Performance Metrics

  • MMLU: 73.0%
  • HumanEval: 63.2%
  • MATH: 50.6%
Platform Exclusive

Major Release

Claude 3 Opus

Anthropic
Released: 2024-03-04
Type: LLM
Size: ~175B
Most capable model in the Claude 3 family with near-human performance on complex tasks

✨ Key Features

  • 200K context window
  • Advanced reasoning
  • Multimodal capabilities
  • Constitutional AI training

📊 Performance Metrics

  • MMLU: 86.8%
  • HumanEval: 84.9%
  • MATH: 60.1%

Major Release

Claude 3 Sonnet

Anthropic
Released: 2024-03-04
Type: LLM
Size: ~175B
Balanced model offering strong performance with faster response times

✨ Key Features

  • 200K context window
  • Balanced capability and speed
  • Multimodal input
  • Strong reasoning

📊 Performance Metrics

  • MMLU: 79.0%
  • HumanEval: 73.0%
  • MATH: 40.5%

Major Release

Claude 3 Haiku

Anthropic
Released: 2024-03-04
Type: LLM
Size: ~25B
Fastest and most compact model in the Claude 3 family

✨ Key Features

  • 200K context window
  • Fastest response times
  • Multimodal input
  • Cost-effective

📊 Performance Metrics

  • MMLU: 75.2%
  • HumanEval: 75.9%
  • MATH: 38.9%

February

Major Release

Mistral Large

Mistral
Released: 2024-02-26
Type: LLM
Size: ~70B
Top-tier reasoning model with strong multilingual capabilities

✨ Key Features

  • 32K context window
  • Multilingual capabilities
  • Function calling
  • JSON mode

📊 Performance Metrics

  • MMLU: 81.2%
  • HumanEval: 45%
  • HellaSwag: 89.2%

Major Release

Sora

OpenAI
Released: 2024-02-15
Type: Video
Size: ~1B
OpenAI's text-to-video AI model capable of generating minute-long videos

✨ Key Features

  • Text-to-video generation
  • 60-second video clips
  • Complex scene generation
  • Realistic motion physics

📊 Performance Metrics

  • Video Quality: High
  • Motion Coherence: Excellent
  • Text Adherence: 95%

Major Release

Gemini 1.5 Pro

Google
Released: 2024-02-15
Type: Multimodal
Size: ~175B
Google's next-generation model with breakthrough long context capabilities

✨ Key Features

  • 1M token context window
  • Multimodal understanding
  • Video analysis
  • Audio processing

📊 Performance Metrics

  • MMLU: 85.9%
  • HumanEval: 71.9%
  • MATH: 58.5%

January

Major Release

text-embedding-3-large

OpenAI
Released: 2024-01-25
Type: Embedding
Size: ~7B
OpenAI's most powerful text embedding model

✨ Key Features

  • 3072 embedding dimensions
  • Improved retrieval performance
  • Reduced hallucinations
  • Multi-language support
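
The 3072-dimension figure drives both storage cost and how similarity is computed over the vectors. The sketch below estimates storage and computes cosine similarity in plain Python; float32 storage is an assumption, and the vectors are seeded random stand-ins, not real model output.

```python
import math
import random

DIMENSIONS = 3072      # embedding size listed for this model
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as float32


def storage_bytes(num_vectors: int, dims: int = DIMENSIONS) -> int:
    """Raw storage needed for `num_vectors` dense vectors."""
    return num_vectors * dims * BYTES_PER_FLOAT32


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Dot product of the two vectors divided by the product of their norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


rng = random.Random(0)  # seeded so the stand-in vectors are reproducible
vec_a = [rng.gauss(0.0, 1.0) for _ in range(DIMENSIONS)]
vec_b = [rng.gauss(0.0, 1.0) for _ in range(DIMENSIONS)]

print(f"{storage_bytes(1_000_000) / 1e9:.3f} GB for one million vectors")
print(f"self-similarity: {cosine_similarity(vec_a, vec_a):.3f}")
print(f"random pair:     {cosine_similarity(vec_a, vec_b):.3f}")
```

At float32, each vector takes 12,288 bytes, so one million vectors need about 12.3 GB before any index overhead.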

📊 Performance Metrics

  • MTEB Score: 64.6%
  • Dimensions: 3072
  • Languages: 100+

Major Release

GPT-4 Turbo

OpenAI
Released: 2024-01-25
Type: LLM
Size: ~175B
Latest iteration of GPT-4 with improved performance and longer context window

✨ Key Features

  • 128K context window
  • Improved instruction following
  • Enhanced reasoning capabilities
  • Reduced hallucinations

📊 Performance Metrics

  • MMLU: 86.4%
  • HumanEval: 67%
  • HellaSwag: 95.3%

2023

December

Major Release

Midjourney v6

Midjourney
Released: 2023-12-21
Type: Vision
Size: ~5B
Midjourney's latest high-quality image generation model

✨ Key Features

  • Photorealistic image generation
  • Improved prompt understanding
  • Better human anatomy
  • Text rendering

📊 Performance Metrics

  • Image Quality: Very High
  • Prompt Adherence: 92%
  • Style Versatility: Excellent

Major Release

Grok-1

xAI
Released: 2023-12-07
Type: LLM
Size: ~314B
xAI's first major language model with real-time internet access

✨ Key Features

  • Real-time information
  • Conversational interface
  • X platform integration
  • Uncensored responses

📊 Performance Metrics

  • MMLU: 73.0%
  • HumanEval: 63.2%
  • GSM8K: 62.9%
Platform Exclusive

Research

AlphaCode 2

Google DeepMind
Released: 2023-12-06
Type: Code
Size: ~340B
DeepMind's advanced code generation system for competitive programming

✨ Key Features

  • Advanced code generation
  • Competitive programming
  • Multi-language support
  • Problem decomposition

📊 Performance Metrics

  • Codeforces Rating: 1747
  • Problem Solving: 85th percentile
  • Language Support: 10+ languages

Major Release

Gemini Ultra

Google
Released: 2023-12-06
Type: Multimodal
Size: ~540B
Google's most capable multimodal AI model

✨ Key Features

  • Multimodal reasoning
  • Text, image, audio, video understanding
  • Advanced mathematical reasoning
  • Code generation

📊 Performance Metrics

  • MMLU: 90.0%
  • HumanEval: 74.4%
  • MATH: 53.2%

Major Release

Gemini Pro

Google
Released: 2023-12-06
Type: Multimodal
Size: ~175B
Google's balanced model for a wide range of tasks

✨ Key Features

  • Multimodal capabilities
  • 32K context window
  • Fast inference
  • Scalable deployment

📊 Performance Metrics

  • MMLU: 83.7%
  • HumanEval: 67.7%
  • MATH: 32.6%

November

Update

Claude 2.1

Anthropic
Released: 2023-11-21
Type: LLM
Size: ~175B
Significant improvements in accuracy and honesty over Claude 2

✨ Key Features

  • 200K context window
  • Reduced hallucination rates
  • Enhanced accuracy
  • Tool use capabilities

📊 Performance Metrics

  • MMLU: 88.0%
  • HumanEval: 71.2%
  • MATH: 71.1%

Major Release

Whisper v3

OpenAI
Released: 2023-11-06
Type: Audio
Size: ~1.55B
OpenAI's multilingual speech recognition system

✨ Key Features

  • Multilingual speech recognition
  • Support for 99 languages
  • Robust noise handling
  • Real-time transcription

📊 Performance Metrics

  • WER English: 5.1%
  • Language Coverage: 99 languages
  • Real-time Factor: 0.8x
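
The tracker does not define real-time factor. Assuming the common definition (processing time divided by audio duration), a value of 0.8x means transcription finishes slightly faster than the audio plays. The sketch below applies that reading to a hypothetical one-hour recording; `processing_seconds` is an illustrative helper, not part of any Whisper API.

```python
def processing_seconds(audio_seconds: float, real_time_factor: float) -> float:
    """Estimated processing time, assuming RTF = processing time / audio duration."""
    return audio_seconds * real_time_factor


rtf = 0.8        # listed real-time factor
audio = 60 * 60  # hypothetical one-hour recording, in seconds

elapsed = processing_seconds(audio, rtf)
print(f"{audio / 60:.0f} min of audio -> about {elapsed / 60:.0f} min of processing")
print("Keeps up with live audio:", rtf < 1.0)
```

Any factor below 1.0 keeps pace with a live stream on average; actual throughput depends on hardware and batch size, which the listing does not specify.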

October

Major Release

DALL-E 3

OpenAI
Released: 2023-10-04
Type: Vision
Size: ~6B
Latest iteration of DALL-E with significantly improved image generation

✨ Key Features

  • Improved prompt adherence
  • Higher resolution outputs
  • Better text rendering
  • Enhanced artistic styles

📊 Performance Metrics

  • Image Quality: High
  • Prompt Adherence: 90%
  • Safety Filtering: 99%

September

Major Release

GPT-4V (Vision)

OpenAI
Released: 2023-09-25
Type: Multimodal
Size: ~175B
GPT-4 enhanced with vision capabilities for image understanding

✨ Key Features

  • Image understanding
  • Visual reasoning
  • Multimodal capabilities
  • Enhanced context understanding

📊 Performance Metrics

  • MMLU: 86.4%
  • MathVista: 49.9%
  • AI2D: 78.2%

August

Major Release

Code Llama 34B

Meta
Released: 2023-08-24
Type: Code
Size: 34B
Specialized model for code generation built on Llama 2

✨ Key Features

  • Code generation
  • Code completion
  • Multiple programming languages
  • Large context window

📊 Performance Metrics

  • HumanEval: 48.4%
  • MBPP: 55.0%
  • MultiPL-E: 45.9%

July

Research

RT-2 (Robotics Transformer)

Google
Released: 2023-07-28
Type: Robotics
Size: ~55B
Google's vision-language-action model for robotic control

✨ Key Features

  • Vision-language-action model
  • Robotic control
  • Web-scale training
  • Emergent skills

📊 Performance Metrics

  • Success Rate: 62%
  • Novel Tasks: 3x improvement
  • Emergent Skills: High

Major Release

Stable Diffusion XL

Stability AI
Released: 2023-07-26
Type: Vision
Size: ~3.5B
High-resolution text-to-image model with improved quality

✨ Key Features

  • 1024x1024 native resolution
  • Improved image quality
  • Better text rendering
  • Fine-tunable

📊 Performance Metrics

  • Resolution: 1024x1024
  • FID Score: 23.4
  • CLIP Score: 31.8

Major Release

Llama 2 70B

Meta
Released: 2023-07-18
Type: LLM
Size: 70B
Meta's open-source large language model with commercial license

✨ Key Features

  • Open source
  • Commercial license
  • Improved safety
  • Enhanced performance

📊 Performance Metrics

  • MMLU: 68.9%
  • HumanEval: 29.9%
  • MATH: 13.5%

Major Release

Claude 2

Anthropic
Released: 2023-07-11
Type: LLM
Size: ~175B
Significant improvement over Claude 1 with enhanced capabilities

✨ Key Features

  • 100K context window
  • Improved safety
  • Enhanced reasoning
  • Better code generation

📊 Performance Metrics

  • MMLU: 78.5%
  • HumanEval: 71.2%
  • GSM8K: 88.0%

May

Major Release

PaLM 2

Google
Released: 2023-05-10
Type: LLM
Size: ~340B
Google's improved large language model powering Bard and other services

✨ Key Features

  • Multilingual capabilities
  • Reasoning improvements
  • Coding abilities
  • Multiple model sizes

📊 Performance Metrics

  • MMLU: 86.4%
  • HumanEval: 77.2%
  • MATH: 34.3%

March

Major Release

GPT-4

OpenAI
Released: 2023-03-14
Type: LLM
Size: ~175B
OpenAI's most advanced system producing safer and more useful responses

✨ Key Features

  • 8K context window
  • Multimodal capabilities
  • Enhanced reasoning
  • Improved factual accuracy

📊 Performance Metrics

  • MMLU: 86.4%
  • HumanEval: 67.0%
  • MATH: 42.5%

Major Release

Claude 1.3

Anthropic
Released: 2023-03-14
Type: LLM
Size: ~52B
Anthropic's AI assistant built using Constitutional AI methods

✨ Key Features

  • Constitutional AI
  • Helpful and harmless
  • Long conversations
  • Improved reasoning

📊 Performance Metrics

  • MMLU: 77.0%
  • HumanEval: 56.0%
  • MATH: 36.0%

January

Research

MusicLM

Google
Released: 2023-01-26
Type: Audio
Size: ~1.2B
Google's text-to-music generation system

✨ Key Features

  • Text-to-music generation
  • Hierarchical sequence modeling
  • Long-form music generation
  • Style control

📊 Performance Metrics

  • Audio Quality: 24 kHz
  • Generation Length: Several minutes
  • Style Adherence: High

2022

November

Major Release

ChatGPT (GPT-3.5 Turbo)

OpenAI
Released: 2022-11-30
Type: LLM
Size: ~175B
Conversational AI that sparked mainstream adoption of large language models

✨ Key Features

  • Conversational interface
  • Fine-tuned for chat
  • RLHF training
  • Fast response times

📊 Performance Metrics

  • MMLU: 70.0%
  • HumanEval: 48.1%
  • MATH: 34.1%
Free + Paid Tiers

April

Major Release

DALL-E 2

OpenAI
Released: 2022-04-06
Type: Vision
Size: ~3.5B
AI system that creates realistic images and art from natural language descriptions

✨ Key Features

  • Text-to-image generation
  • High-resolution outputs
  • Style transfer
  • Inpainting capabilities

📊 Performance Metrics

  • FID: 7.27
  • CLIP Score: 35.7
  • IS: 22.7

Research

PaLM

Google
Released: 2022-04-04
Type: LLM
Size: 540B
Google's 540-billion parameter language model demonstrating breakthrough capabilities

✨ Key Features

  • Large parameter count
  • Few-shot learning
  • Reasoning capabilities
  • Code generation

📊 Performance Metrics

  • MMLU: 69.3%
  • HumanEval: 26.2%
  • MATH: 8.8%

January

Research

InstructGPT

OpenAI
Released: 2022-01-27
Type: LLM
Size: ~175B
GPT-3 fine-tuned to follow instructions using human feedback

✨ Key Features

  • RLHF training
  • Instruction following
  • Reduced harmful outputs
  • Better alignment

📊 Performance Metrics

  • TruthfulQA: 58.0%
  • RealToxicityPrompts: 0.3%
  • Helpfulness: 85.0%

2021

August

Major Release

Codex

OpenAI
Released: 2021-08-10
Type: Code
Size: ~12B
AI system that translates natural language to code

✨ Key Features

  • Code generation
  • Natural language to code
  • Multiple programming languages
  • GitHub Copilot integration

📊 Performance Metrics

  • HumanEval: 28.8%
  • MBPP: 59.6%
  • APPS: 25.0%

January

Research

DALL-E

OpenAI
Released: 2021-01-05
Type: Vision
Size: ~12B
Neural network that creates images from text captions

✨ Key Features

  • Text-to-image generation
  • Creative image synthesis
  • Concept combination
  • Style control

📊 Performance Metrics

  • FID: 17.89
  • IS: 17.9
  • CLIP Score: 32.5

2020

June

Major Release

GPT-3

OpenAI
Released: 2020-06-11
Type: LLM
Size: 175B
Breakthrough large language model that demonstrated emergent capabilities

✨ Key Features

  • 175 billion parameters
  • Few-shot learning
  • Text generation
  • Multiple capabilities

📊 Performance Metrics

  • MMLU: 43.9%
  • HumanEval: 0.0%
  • MATH: 5.2%