AI Flash Report

LiveCodeBench leaderboard

LiveCodeBench is a contamination-free coding benchmark: its programming problems are collected after the training cutoffs of the models it evaluates, so they should not have appeared in those models' training data.

99 models ranked, highest score first.

LiveCodeBench leaderboard: 99 models ranked by score

| Rank | Model | Company | Score |
|---|---|---|---|
| 1 | GPT-5.2 | OpenAI | 88.9% |
| 2 | Claude Opus 4.5 | Anthropic | 87.1% |
| 3 | GPT-5.1 | OpenAI | 86.8% |
| 4 | DeepSeek V3.2 | DeepSeek | 86.2% |
| 5 | GPT-5.3 Codex | OpenAI | 84.2% |
| 6 | GPT-5.2 Codex | OpenAI | 80.4% |
| 7 | Gemini 3.1 Pro | Google | 78.9% |
| 8 | GPT-5.5 | OpenAI | 74.3% |
| 9 | MiniMax-M3 | MiniMax | 74.0% |
| 10 | GPT-5.4 | OpenAI | 74.0% |
| 11 | MiMo-V2.5-Pro | Xiaomi | 73.3% |
| 12 | Gemini 3.1 Pro Preview | Google | 72.7% |
| 13 | Claude Opus 4.7 | Anthropic | 70.3% |
| 14 | Qwen3.6 Max Preview | Alibaba | 69.7% |
| 15 | Kimi K2.6 | Kimi | 69.7% |
| 16 | Muse Spark | Meta | 69.7% |
| 17 | Qwen3.6 Plus | Alibaba | 69.7% |
| 18 | Gemini 2.5 Flash | Google | 69.5% |
| 19 | GPT-5.4 mini | OpenAI | 69.3% |
| 20 | Qwen3.7 Max | Alibaba | 69.0% |
| 21 | Kimi K2 | Moonshot AI | 68.9% |
| 22 | Qwen3.6 27B | Alibaba | 68.7% |
| 23 | MiniMax-M2.7 | MiniMax | 68.7% |
| 24 | Claude Opus 4.8 | Anthropic | 67.7% |
| 25 | Qwen3.5 27B | Alibaba | 67.3% |
| 26 | Nemotron 3 Ultra 550B A55B | NVIDIA | 67.0% |
| 27 | MiMo-V2-Omni | Xiaomi | 66.7% |
| 28 | Qwen3.5 122B A10B | Alibaba | 66.7% |
| 29 | DeepSeek V4 Pro | DeepSeek | 66.3% |
| 30 | GPT-5.4 nano | OpenAI | 66.0% |
| 31 | Qwen3.5 397B A17B | Alibaba | 65.7% |
| 32 | Gemini 3.1 Flash-Lite Preview | Google | 65.3% |
| 33 | Qwen3.7 Plus | Alibaba | 65.0% |
| 34 | Ring-2.6-1T | InclusionAI | 64.3% |
| 35 | Grok 4.3 | xAI | 64.3% |
| 36 | Step 3.7 Flash | StepFun | 63.7% |
| 37 | Qwen3.6 35B A3B | Alibaba | 63.7% |
| 38 | MiMo-V2-Omni-0327 | Xiaomi | 63.7% |
| 39 | DeepSeek V4 Flash | DeepSeek | 63.0% |
| 40 | MiMo-V2.5 | Xiaomi | 62.7% |
| 41 | Qwen3.5 35B A3B | Alibaba | 62.7% |
| 42 | GLM-5.1 | Z AI | 62.3% |
| 43 | Gemma 4 31B | Google | 62.0% |
| 44 | Mistral Medium 3.5 | Mistral | 61.0% |
| 45 | GLM 5V Turbo | Z AI | 61.0% |
| 46 | MiMo-V2-Pro | Xiaomi | 60.7% |
| 47 | GLM-5-Turbo | Z AI | 60.7% |
| 48 | NVIDIA Nemotron 3 Super 120B A12B | NVIDIA | 60.0% |
| 49 | Grok 4.20 0309 | xAI | 59.0% |
| 50 | Qwen3.5 9B | Alibaba | 59.0% |
| 51 | Claude Sonnet 4.6 | Anthropic | 58.7% |
| 52 | Grok 4.20 0309 v2 | xAI | 58.0% |
| 53 | GPT-5 | OpenAI | 55.8% |
| 54 | GPT-5.5 Instant | OpenAI | 55.7% |
| 55 | Gemma 4 26B A4B | Google | 55.7% |
| 56 | Qwen3.5 4B | Alibaba | 55.7% |
| 57 | Gemma 4 12B | Google | 55.3% |
| 58 | JT-35B-Flash | China Mobile | 55.3% |
| 59 | Hy3-preview | Tencent | 54.7% |
| 60 | Step 3.5 Flash 2603 | StepFun | 54.3% |
| 61 | Gemini 3.5 Flash | Google | 53.3% |
| 62 | Qwen3.5 Omni Plus | Alibaba | 52.7% |
| 63 | EXAONE 4.5 33B | LG AI Research | 49.3% |
| 64 | Mistral Large 3 | Mistral | 46.5% |
| 65 | Mistral Small 4 | Mistral | 44.7% |
| 66 | Qwen3.5 Omni Flash | Alibaba | 44.0% |
| 67 | Claude 3.5 Sonnet | Anthropic | 38.1% |
| 68 | Mercury 2 | Inception | 36.3% |
| 69 | DeepSeek-V3 | DeepSeek | 35.9% |
| 70 | Nemotron 3 Nano Omni 30B A3B Reasoning | NVIDIA | 35.7% |
| 71 | Ling-2.6-1T | InclusionAI | 34.7% |
| 72 | Nemotron Cascade 2 30B A3B | NVIDIA | 34.0% |
| 73 | Trinity Large Thinking | Arcee AI | 33.0% |
| 74 | Gemma 4 E4B | Google | 30.7% |
| 75 | GPT-4 Turbo | OpenAI | 29.1% |
| 76 | Claude 3 Opus | Anthropic | 27.9% |
| 77 | Solar Pro 3 | Upstage | 27.0% |
| 78 | Grok-2 | xAI | 26.7% |
| 79 | Ling 2.6 Flash | InclusionAI | 25.0% |
| 80 | Gemini 1.5 Pro | Google | 24.4% |
| 81 | Qwen3.5 2B | Alibaba | 23.7% |
| 82 | Gemini 2.0 Flash | Google | 21.0% |
| 83 | Claude 2.1 | Anthropic | 19.5% |
| 84 | Granite 4.1 30B | IBM | 18.7% |
| 85 | Mistral Large | Mistral | 17.8% |
| 86 | Claude 3 Sonnet | Anthropic | 17.5% |
| 87 | NVIDIA Nemotron 3 Nano 4B | NVIDIA | 16.7% |
| 88 | Claude 3 Haiku | Anthropic | 15.4% |
| 89 | Gemma 4 E2B | Google | 15.0% |
| 90 | Granite 4.1 8B | IBM | 12.0% |
| 91 | JT-MINI | China Mobile | 11.7% |
| 92 | MiniCPM-V 4.6 1.3B | OpenBMB | 6.3% |
| 93 | Qwen3.5 0.8B | Alibaba | 5.3% |
| 94 | MiniCPM5-1B | OpenBMB | 3.7% |
| 95 | Granite 4.1 3B | IBM | 3.0% |
| 96 | Sarvam 105B | Sarvam | 0.0% |
| 97 | Sarvam 30B | Sarvam | 0.0% |
| 98 | LFM2 24B A2B | Liquid AI | 0.0% |
| 99 | Tiny Aya Global | Cohere | 0.0% |
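
The ordering used in the table (highest score first, with tied scores still receiving consecutive ranks, as with the four models at 69.7%) can be reproduced with a short sketch. The `Entry` type and `rank_entries` function are hypothetical names introduced here for illustration. The source does not say how ties are broken, so the sketch simply assumes tied entries keep their input order.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    model: str
    company: str
    score: float  # percentage, e.g. 88.9 for "88.9%"


def rank_entries(entries: list[Entry]) -> list[tuple[int, Entry]]:
    """Sort by score, highest first, and assign consecutive ranks.

    sorted() is stable, so tied entries keep their input order. The
    leaderboard's real tie-break rule is not stated; this is an assumption.
    """
    ordered = sorted(entries, key=lambda e: e.score, reverse=True)
    return list(enumerate(ordered, start=1))


# A few rows copied from the table, deliberately out of order.
sample = [
    Entry("MiniMax-M3", "MiniMax", 74.0),
    Entry("GPT-5.2", "OpenAI", 88.9),
    Entry("GPT-5.4", "OpenAI", 74.0),
    Entry("Claude Opus 4.5", "Anthropic", 87.1),
    Entry("GPT-5.5", "OpenAI", 74.3),
]

for rank, entry in rank_entries(sample):
    print(f"{rank} {entry.model} {entry.company} {entry.score:.1f}%")
```

Because the ranks are consecutive rather than shared, two models with the same score (here MiniMax-M3 and GPT-5.4 at 74.0%) still occupy different positions, so a small rank gap does not by itself indicate a score gap.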