AI Flash Report

🚀 AI Model Release Timeline

Comprehensive tracker of major AI model releases with specifications and performance metrics

43 model releases tracked | Last updated: 2025-07-15 12:11 UTC

2025

May

Major Release

Claude Sonnet 4

Anthropic
Released: 2025-05-22
Type: LLM
Size: ~500B
Latest-generation Claude model with significant performance improvements

✨ Key Features

  • Enhanced reasoning capabilities
  • Improved safety measures
  • Advanced multimodal understanding
  • Extended context window

📊 Performance Metrics

MMLU: 91.2%
HumanEval: 94.5%
MATH: 76.8%

February

Update

Claude 3.7 Sonnet

Anthropic
Released: 2025-02-24
Type: LLM
Size: ~300B
Iterative improvement on Claude 3.5 Sonnet with enhanced capabilities

✨ Key Features

  • Improved reasoning
  • Better code generation
  • Enhanced safety
  • Reduced hallucinations

📊 Performance Metrics

MMLU: 89.5%
HumanEval: 93.2%
MATH: 74.1%

2024

December

Major Release

DeepSeek-V3

DeepSeek
Released: 2024-12-26
Type: LLM
Size: 671B
DeepSeek's most advanced open-source model, built on a Mixture-of-Experts (MoE) architecture

✨ Key Features

  • Mixture of Experts architecture
  • Cost-effective training
  • Open source release
  • Strong reasoning capabilities

📊 Performance Metrics

MMLU: 88.5%
HumanEval: 90.2%
MATH: 76.1%

Major Release

Gemini 2.0 Flash

Google
Released: 2024-12-11
Type: Multimodal
Size: ~175B
Google's next-generation model with native multimodal capabilities

✨ Key Features

  • Native multimodal generation
  • Real-time API
  • Agentic capabilities
  • Enhanced speed

📊 Performance Metrics

MMLU: 85.8%
HumanEval: 71.9%
MATH: 58.8%

October

Major Release

Moonshot Kimi

Moonshot AI
Released: 2024-10-16
Type: LLM
Size: ~100B
Chinese AI company's large language model optimized for Asian languages

✨ Key Features

  • 200K context window
  • Multilingual support
  • Document processing
  • Chinese language optimization

📊 Performance Metrics

C-Eval: 77.9%
CMMLU: 75.2%
HumanEval: 65.8%

September

Major Release

o1-preview

OpenAI
Released: 2024-09-12
Type: LLM
Size: ~175B
OpenAI's most advanced reasoning model designed for complex problem-solving

✨ Key Features

  • Advanced reasoning capabilities
  • Chain-of-thought processing
  • Enhanced problem-solving
  • Improved mathematical reasoning

📊 Performance Metrics

AIME: 83rd percentile
GPQA: 78%
MATH: 83%

August

Major Release

Grok-2

xAI
Released: 2024-08-13
Type: LLM
Size: ~314B
xAI's flagship model with real-time web access and multimodal capabilities

✨ Key Features

  • Real-time information access
  • Multimodal understanding
  • X platform integration
  • Conversational AI

📊 Performance Metrics

MMLU: 84.0%
HumanEval: 74.1%
MATH: 56.0%
Platform Exclusive

July

Major Release

GitHub Copilot Chat

Microsoft
Released: 2024-07-25
Type: Code
Size: ~70B
AI-powered coding assistant with conversational interface

✨ Key Features

  • Conversational coding
  • IDE integration
  • Code explanation
  • Multi-language support

📊 Performance Metrics

HumanEval: 84.2%
MBPP: 78.9%
CodeT5: 71.4%

June

Major Release

Claude 3.5 Sonnet

Anthropic
Released: 2024-06-20
Type: LLM
Size: ~175B
Anthropic's most intelligent model with significantly improved capabilities

✨ Key Features

  • 200K context window
  • Improved coding capabilities
  • Enhanced reasoning
  • Vision capabilities

📊 Performance Metrics

MMLU: 88.7%
HumanEval: 92%
MATH: 71.1%

March

Update

Grok-1.5

xAI
Released: 2024-03-28
Type: LLM
Size: ~314B
Enhanced version of Grok with improved performance across benchmarks

✨ Key Features

  • Improved reasoning
  • Code generation
  • 128K context window
  • Real-time data access

📊 Performance Metrics

MMLU: 73.0%
HumanEval: 63.2%
MATH: 50.6%
Platform Exclusive

Major Release

Claude 3 Opus

Anthropic
Released: 2024-03-04
Type: LLM
Size: ~175B
Most capable model in the Claude 3 family with near-human performance on complex tasks

✨ Key Features

  • 200K context window
  • Advanced reasoning
  • Multimodal capabilities
  • Constitutional AI training

📊 Performance Metrics

MMLU: 86.8%
HumanEval: 84.9%
MATH: 60.1%

Major Release

Claude 3 Sonnet

Anthropic
Released: 2024-03-04
Type: LLM
Size: ~175B
Balanced model offering strong performance with faster response times

✨ Key Features

  • 200K context window
  • Balanced capability and speed
  • Multimodal input
  • Strong reasoning

📊 Performance Metrics

MMLU: 79.0%
HumanEval: 73.0%
MATH: 40.5%

Major Release

Claude 3 Haiku

Anthropic
Released: 2024-03-04
Type: LLM
Size: ~25B
Fastest and most compact model in the Claude 3 family

✨ Key Features

  • 200K context window
  • Fastest response times
  • Multimodal input
  • Cost-effective

📊 Performance Metrics

MMLU: 75.2%
HumanEval: 75.9%
MATH: 38.9%

February

Major Release

Mistral Large

Mistral
Released: 2024-02-26
Type: LLM
Size: ~70B
Top-tier reasoning model with strong multilingual capabilities

✨ Key Features

  • 32K context window
  • Multilingual capabilities
  • Function calling
  • JSON mode

📊 Performance Metrics

MMLU: 81.2%
HumanEval: 45%
HellaSwag: 89.2%

Major Release

Sora

OpenAI
Released: 2024-02-15
Type: Video
Size: ~1B
OpenAI's text-to-video AI model capable of generating minute-long videos

✨ Key Features

  • Text-to-video generation
  • 60-second video clips
  • Complex scene generation
  • Realistic motion physics

📊 Performance Metrics

Video Quality: High
Motion Coherence: Excellent
Text Adherence: 95%

Major Release

Gemini 1.5 Pro

Google
Released: 2024-02-15
Type: Multimodal
Size: ~175B
Google's next-generation model with breakthrough long context capabilities

✨ Key Features

  • 1M token context window
  • Multimodal understanding
  • Video analysis
  • Audio processing

📊 Performance Metrics

MMLU: 85.9%
HumanEval: 71.9%
MATH: 58.5%

January

Major Release

text-embedding-3-large

OpenAI
Released: 2024-01-25
Type: Embedding
Size: ~7B
OpenAI's most powerful text embedding model

✨ Key Features

  • 3072 embedding dimensions
  • Improved retrieval performance
  • Shortenable embeddings via the dimensions parameter
  • Multi-language support

📊 Performance Metrics

MTEB Score: 64.6%
Dimensions: 3072
Languages: 100+

Major Release

GPT-4 Turbo

OpenAI
Released: 2024-01-25
Type: LLM
Size: ~175B
Latest iteration of GPT-4 with improved performance and longer context window

✨ Key Features

  • 128K context window
  • Improved instruction following
  • Enhanced reasoning capabilities
  • Reduced hallucinations

📊 Performance Metrics

MMLU: 86.4%
HumanEval: 67%
HellaSwag: 95.3%

2023

December

Major Release

Midjourney v6

Midjourney
Released: 2023-12-21
Type: Vision
Size: ~5B
Midjourney's latest high-quality image generation model

✨ Key Features

  • Photorealistic image generation
  • Improved prompt understanding
  • Better human anatomy
  • Text rendering

📊 Performance Metrics

Image Quality: Very High
Prompt Adherence: 92%
Style Versatility: Excellent

Major Release

Grok-1

xAI
Released: 2023-12-07
Type: LLM
Size: ~314B
xAI's first major language model with real-time internet access

✨ Key Features

  • Real-time information
  • Conversational interface
  • X platform integration
  • Uncensored responses

📊 Performance Metrics

MMLU: 73.0%
HumanEval: 63.2%
GSM8K: 62.9%
Platform Exclusive

Research

AlphaCode 2

Google DeepMind
Released: 2023-12-06
Type: Code
Size: ~340B
DeepMind's advanced code generation system for competitive programming

✨ Key Features

  • Advanced code generation
  • Competitive programming
  • Multi-language support
  • Problem decomposition

📊 Performance Metrics

Codeforces Rating: 1747
Problem Solving: 85th percentile
Language Support: 10+ languages

Major Release

Gemini Ultra

Google
Released: 2023-12-06
Type: Multimodal
Size: ~540B
Google's most capable multimodal AI model

✨ Key Features

  • Multimodal reasoning
  • Text, image, audio, video understanding
  • Advanced mathematical reasoning
  • Code generation

📊 Performance Metrics

MMLU: 90.0%
HumanEval: 74.4%
MATH: 53.2%

Major Release

Gemini Pro

Google
Released: 2023-12-06
Type: Multimodal
Size: ~175B
Google's balanced model for a wide range of tasks

✨ Key Features

  • Multimodal capabilities
  • 32K context window
  • Fast inference
  • Scalable deployment

📊 Performance Metrics

MMLU: 83.7%
HumanEval: 67.7%
MATH: 32.6%

November

Update

Claude 2.1

Anthropic
Released: 2023-11-21
Type: LLM
Size: ~175B
Significant improvements in accuracy and honesty over Claude 2

✨ Key Features

  • 200K context window
  • Reduced hallucination rates
  • Enhanced accuracy
  • Tool use capabilities

📊 Performance Metrics

MMLU: 88.0%
HumanEval: 71.2%
MATH: 71.1%

Major Release

Whisper v3

OpenAI
Released: 2023-11-06
Type: Audio
Size: ~1.55B
OpenAI's multilingual speech recognition system

✨ Key Features

  • Multilingual speech recognition
  • Support for 99 languages
  • Robust noise handling
  • Real-time transcription

📊 Performance Metrics

WER English: 5.1%
Language Coverage: 99 languages
Real-time Factor: 0.8x

October

Major Release

DALL-E 3

OpenAI
Released: 2023-10-04
Type: Vision
Size: ~6B
Latest iteration of DALL-E with significantly improved image generation

✨ Key Features

  • Improved prompt adherence
  • Higher resolution outputs
  • Better text rendering
  • Enhanced artistic styles

📊 Performance Metrics

Image Quality: High
Prompt Adherence: 90%
Safety Filtering: 99%

September

Major Release

GPT-4V (Vision)

OpenAI
Released: 2023-09-25
Type: Multimodal
Size: ~175B
GPT-4 enhanced with vision capabilities for image understanding

✨ Key Features

  • Image understanding
  • Visual reasoning
  • Multimodal capabilities
  • Enhanced context understanding

📊 Performance Metrics

MMLU: 86.4%
MathVista: 49.9%
AI2D: 78.2%

August

Major Release

Code Llama 34B

Meta
Released: 2023-08-24
Type: Code
Size: 34B
Specialized model for code generation built on Llama 2

✨ Key Features

  • Code generation
  • Code completion
  • Multiple programming languages
  • Large context window

📊 Performance Metrics

HumanEval: 48.4%
MBPP: 55.0%
MultiPL-E: 45.9%

July

Research

RT-2 (Robotics Transformer)

Google
Released: 2023-07-28
Type: Robotics
Size: ~55B
Google's vision-language-action model for robotic control

✨ Key Features

  • Vision-language-action model
  • Robotic control
  • Web-scale training
  • Emergent skills

📊 Performance Metrics

Success Rate: 62%
Novel Tasks: 3x improvement
Emergent Skills: High

Major Release

Stable Diffusion XL

Stability AI
Released: 2023-07-26
Type: Vision
Size: ~3.5B
High-resolution text-to-image model with improved quality

✨ Key Features

  • 1024x1024 native resolution
  • Improved image quality
  • Better text rendering
  • Fine-tunable

📊 Performance Metrics

Resolution: 1024x1024
FID Score: 23.4
CLIP Score: 31.8

Major Release

Llama 2 70B

Meta
Released: 2023-07-18
Type: LLM
Size: 70B
Meta's open-source large language model with a commercial license

✨ Key Features

  • Open source
  • Commercial license
  • Improved safety
  • Enhanced performance

📊 Performance Metrics

MMLU: 68.9%
HumanEval: 29.9%
MATH: 13.5%

Major Release

Claude 2

Anthropic
Released: 2023-07-11
Type: LLM
Size: ~175B
Significant improvement over Claude 1 with enhanced capabilities

✨ Key Features

  • 100K context window
  • Improved safety
  • Enhanced reasoning
  • Better code generation

📊 Performance Metrics

MMLU: 78.5%
HumanEval: 71.2%
GSM8K: 88.0%

May

Major Release

PaLM 2

Google
Released: 2023-05-10
Type: LLM
Size: ~340B
Google's improved large language model powering Bard and other services

✨ Key Features

  • Multilingual capabilities
  • Reasoning improvements
  • Coding abilities
  • Multiple model sizes

📊 Performance Metrics

MMLU: 86.4%
HumanEval: 77.2%
MATH: 34.3%

March

Major Release

GPT-4

OpenAI
Released: 2023-03-14
Type: LLM
Size: ~175B
OpenAI's most advanced system producing safer and more useful responses

✨ Key Features

  • 8K context window
  • Multimodal capabilities
  • Enhanced reasoning
  • Improved factual accuracy

📊 Performance Metrics

MMLU: 86.4%
HumanEval: 67.0%
MATH: 42.5%

Major Release

Claude 1.3

Anthropic
Released: 2023-03-14
Type: LLM
Size: ~52B
Anthropic's AI assistant built using Constitutional AI methods

✨ Key Features

  • Constitutional AI
  • Helpful and harmless
  • Long conversations
  • Improved reasoning

📊 Performance Metrics

MMLU: 77.0%
HumanEval: 56.0%
MATH: 36.0%

January

Research

MusicLM

Google
Released: 2023-01-26
Type: Audio
Size: ~1.2B
Google's text-to-music generation system

✨ Key Features

  • Text-to-music generation
  • Hierarchical sequence modeling
  • Long-form music generation
  • Style control

📊 Performance Metrics

Audio Quality: 24 kHz
Generation Length: Several minutes
Style Adherence: High

2022

November

Major Release

ChatGPT (GPT-3.5 Turbo)

OpenAI
Released: 2022-11-30
Type: LLM
Size: ~175B
Conversational AI that sparked mainstream adoption of large language models

✨ Key Features

  • Conversational interface
  • Fine-tuned for chat
  • RLHF training
  • Fast response times

📊 Performance Metrics

MMLU: 70.0%
HumanEval: 48.1%
MATH: 34.1%
Free + Paid Tiers

April

Major Release

DALL-E 2

OpenAI
Released: 2022-04-06
Type: Vision
Size: ~3.5B
AI system that creates realistic images and art from natural language descriptions

✨ Key Features

  • Text-to-image generation
  • High-resolution outputs
  • Style transfer
  • Inpainting capabilities

📊 Performance Metrics

FID: 7.27
CLIP Score: 35.7
IS: 22.7

Research

PaLM

Google
Released: 2022-04-04
Type: LLM
Size: 540B
Google's 540-billion-parameter language model demonstrating breakthrough capabilities

✨ Key Features

  • Large parameter count
  • Few-shot learning
  • Reasoning capabilities
  • Code generation

📊 Performance Metrics

MMLU: 69.3%
HumanEval: 26.2%
MATH: 8.8%

January

Research

InstructGPT

OpenAI
Released: 2022-01-27
Type: LLM
Size: ~175B
GPT-3 fine-tuned to follow instructions using human feedback

✨ Key Features

  • RLHF training
  • Instruction following
  • Reduced harmful outputs
  • Better alignment

📊 Performance Metrics

TruthfulQA: 58.0%
RealToxicityPrompts: 0.3%
Helpfulness: 85.0%

2021

August

Major Release

Codex

OpenAI
Released: 2021-08-10
Type: Code
Size: ~12B
AI system that translates natural language to code

✨ Key Features

  • Code generation
  • Natural language to code
  • Multiple programming languages
  • GitHub Copilot integration

📊 Performance Metrics

HumanEval: 28.8%
MBPP: 59.6%
APPS: 25.0%

January

Research

DALL-E

OpenAI
Released: 2021-01-05
Type: Vision
Size: ~12B
Neural network that creates images from text captions

✨ Key Features

  • Text-to-image generation
  • Creative image synthesis
  • Concept combination
  • Style control

📊 Performance Metrics

FID: 17.89
IS: 17.9
CLIP Score: 32.5

2020

June

Major Release

GPT-3

OpenAI
Released: 2020-06-11
Type: LLM
Size: 175B
Breakthrough large language model that demonstrated emergent capabilities

✨ Key Features

  • 175 billion parameters
  • Few-shot learning
  • Text generation
  • Multiple capabilities

📊 Performance Metrics

MMLU: 43.9%
HumanEval: 0.0%
MATH: 5.2%