AI Flash Report

LiveCodeBench leaderboard

LiveCodeBench is a coding benchmark designed to avoid training-data contamination: its programming problems are collected after the training cut-offs of the models being evaluated.
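
The cut-off idea can be illustrated with a small date filter. This is only a minimal sketch of the principle, not LiveCodeBench's actual tooling: the `Problem` class, the `post_cutoff` helper, the problem titles, and all dates below are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class Problem:
    """Hypothetical record: a problem title and the day it was published."""
    title: str
    released: date


def post_cutoff(problems: List[Problem], cutoff: date) -> List[Problem]:
    """Keep only problems published strictly after a model's training cut-off."""
    return [p for p in problems if p.released > cutoff]


# Made-up sample data, not taken from the benchmark.
PROBLEMS = [
    Problem("two-pointer-warmup", date(2024, 11, 3)),
    Problem("interval-merge", date(2025, 2, 14)),
    Problem("grid-shortest-path", date(2025, 6, 30)),
]

# Assumed cut-off for some hypothetical model.
eligible = post_cutoff(PROBLEMS, cutoff=date(2025, 1, 1))
print([p.title for p in eligible])
```

Under this assumption, a model is scored only on problems it could not have seen during training, which is why a later cut-off shrinks the eligible set.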

99 models ranked, highest score first.

Rank Model Company Score
1 GPT-5.2 OpenAI 88.9%
2 Claude Opus 4.5 Anthropic 87.1%
3 GPT-5.1 OpenAI 86.8%
4 DeepSeek V3.2 DeepSeek 86.2%
5 GPT-5.3 Codex OpenAI 84.2%
6 GPT-5.2 Codex OpenAI 80.4%
7 Gemini 3.1 Pro Google 78.9%
8 GPT-5.5 OpenAI 74.3%
9 MiniMax-M3 MiniMax 74.0%
10 GPT-5.4 OpenAI 74.0%
11 MiMo-V2.5-Pro Xiaomi 73.3%
12 Claude Opus 4.7 Anthropic 70.3%
13 Claude Fable 5 Anthropic 70.0%
14 Qwen3.6 Max Preview Alibaba 69.7%
15 Kimi K2.6 Moonshot AI 69.7%
16 Muse Spark Meta 69.7%
17 Qwen3.6 Plus Alibaba 69.7%
18 Gemini 2.5 Flash Google 69.5%
19 GPT-5.4 mini OpenAI 69.3%
20 Qwen3.7 Max Alibaba 69.0%
21 Kimi K2 Moonshot AI 68.9%
22 Qwen3.6 27B Alibaba 68.7%
23 MiniMax-M2.7 MiniMax 68.7%
24 Claude Opus 4.8 Anthropic 67.7%
25 Qwen3.5 27B Alibaba 67.3%
26 Nemotron 3 Ultra 550B A55B NVIDIA 67.0%
27 MiMo-V2-Omni Xiaomi 66.7%
28 Qwen3.5 122B A10B Alibaba 66.7%
29 DeepSeek V4 Pro DeepSeek 66.3%
30 GPT-5.4 nano OpenAI 66.0%
31 Gemini 3.1 Flash-Lite Preview Google 65.3%
32 Qwen3.7 Plus Alibaba 65.0%
33 Ring-2.6-1T InclusionAI 64.3%
34 Grok 4.3 xAI 64.3%
35 Step 3.7 Flash StepFun 63.7%
36 Qwen3.6 35B A3B Alibaba 63.7%
37 MiMo-V2-Omni-0327 Xiaomi 63.7%
38 DeepSeek V4 Flash DeepSeek 63.0%
39 MiMo-V2.5 Xiaomi 62.7%
40 Qwen3.5 35B A3B Alibaba 62.7%
41 GLM-5.1 Z AI 62.3%
42 Gemma 4 31B Google 62.0%
43 Mistral Medium 3.5 Mistral 61.0%
44 GLM 5V Turbo Z AI 61.0%
45 MiMo-V2-Pro Xiaomi 60.7%
46 GLM-5-Turbo Z AI 60.7%
47 NVIDIA Nemotron 3 Super 120B A12B NVIDIA 60.0%
48 Grok 4.20 0309 xAI 59.0%
49 Qwen3.5 9B Alibaba 59.0%
50 Claude Sonnet 4.6 Anthropic 58.7%
51 Grok 4.20 0309 v2 xAI 58.0%
52 GPT-5 OpenAI 55.8%
53 GPT-5.5 Instant OpenAI 55.7%
54 Gemma 4 26B A4B Google 55.7%
55 Qwen3.5 4B Alibaba 55.7%
56 Gemma 4 12B Google 55.3%
57 JT-35B-Flash China Mobile 55.3%
58 Hy3-preview Tencent 54.7%
59 Step 3.5 Flash 2603 StepFun 54.3%
60 Gemini 3.5 Flash Google 53.3%
61 Qwen3.5 Omni Plus Alibaba 52.7%
62 EXAONE 4.5 33B LG AI Research 49.3%
63 Mistral Large 3 Mistral 46.5%
64 Mistral Small 4 Mistral 44.7%
65 Qwen3.5 Omni Flash Alibaba 44.0%
66 Claude 3.5 Sonnet Anthropic 38.1%
67 DeepSeek-V3 DeepSeek 35.9%
68 Nemotron 3 Nano Omni 30B A3B Reasoning NVIDIA 35.7%
69 Ling-2.6-1T InclusionAI 34.7%
70 Nemotron Cascade 2 30B A3B NVIDIA 34.0%
71 Trinity Large Thinking Arcee AI 33.0%
72 North Mini Code Cohere 32.3%
73 HyperNova 60B 2605 Multiverse Computing 31.7%
74 Gemma 4 E4B Google 30.7%
75 GPT-4 Turbo OpenAI 29.1%
76 Claude 3 Opus Anthropic 27.9%
77 Solar Pro 3 Upstage 27.0%
78 Grok-2 xAI 26.7%
79 Ling 2.6 Flash InclusionAI 25.0%
80 Gemini 1.5 Pro Google 24.4%
81 Qwen3.5 2B Alibaba 23.7%
82 Gemini 2.0 Flash Google 21.0%
83 Claude 2.1 Anthropic 19.5%
84 Granite 4.1 30B IBM 18.7%
85 Mistral Large Mistral 17.8%
86 Claude 3 Sonnet Anthropic 17.5%
87 NVIDIA Nemotron 3 Nano 4B NVIDIA 16.7%
88 Claude 3 Haiku Anthropic 15.4%
89 Gemma 4 E2B Google 15.0%
90 Granite 4.1 8B IBM 12.0%
91 JT-MINI China Mobile 11.7%
92 MiniCPM-V 4.6 1.3B OpenBMB 6.3%
93 Qwen3.5 0.8B Alibaba 5.3%
94 MiniCPM5-1B OpenBMB 3.7%
95 Granite 4.1 3B IBM 3.0%
96 LFM2.5-8B-A1B Liquid AI 0.0%
97 Sarvam 105B Sarvam 0.0%
98 Sarvam 30B Sarvam 0.0%
99 LFM2 24B A2B Liquid AI 0.0%
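
Several rows in the table share a score (for example, rows 14 to 17 all show 69.7%) yet receive consecutive rank numbers. As a minimal sketch, assuming one wanted tied models to share a rank instead, standard competition ranking could be computed as below. The `competition_ranks` function is a hypothetical helper, not something the leaderboard is known to use; the sample rows are copied from rows 13 to 18 above, so the ranks it produces are relative to that subset only.

```python
from typing import List, Tuple


def competition_ranks(rows: List[Tuple[str, float]]) -> List[Tuple[int, str, float]]:
    """Sort by score (highest first) and give tied scores the same rank.

    Uses standard competition ranking ("1, 2, 2, 4"): after a tie, the next
    rank skips ahead by the number of tied entries.
    """
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)  # stable sort
    result = []
    prev_score = None
    prev_rank = 0
    for position, (model, score) in enumerate(ordered, start=1):
        rank = prev_rank if score == prev_score else position
        result.append((rank, model, score))
        prev_score, prev_rank = score, rank
    return result


# Rows 13 to 18 of the table, as (model, score in percent).
SAMPLE = [
    ("Claude Fable 5", 70.0),
    ("Qwen3.6 Max Preview", 69.7),
    ("Kimi K2.6", 69.7),
    ("Muse Spark", 69.7),
    ("Qwen3.6 Plus", 69.7),
    ("Gemini 2.5 Flash", 69.5),
]

for rank, model, score in competition_ranks(SAMPLE):
    print(rank, model, f"{score}%")
```

Within this subset, the four models at 69.7% share one rank and the next model's rank skips past them.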