AI Flash Report

SciCode leaderboard

96 models ranked, highest score first.

Rank | Model | Company | Score
1 | Claude Fable 5 | Anthropic | 60.2%
2 | GPT-5.4 | OpenAI | 56.6%
3 | GPT-5.5 | OpenAI | 56.1%
4 | Claude Opus 4.7 | Anthropic | 54.5%
5 | Claude Opus 4.8 | Anthropic | 53.5%
6 | Kimi K2.6 | Kimi | 53.5%
7 | GPT-5.3 Codex | OpenAI | 53.2%
8 | GPT-5.2 | OpenAI | 52.1%
9 | Muse Spark | Meta | 51.5%
10 | GPT-5.5 Instant | OpenAI | 50.3%
11 | MiMo-V2.5-Pro | Xiaomi | 50.2%
12 | DeepSeek V4 Pro | DeepSeek | 50.0%
13 | GPT-5.4 mini | OpenAI | 49.9%
14 | Claude Opus 4.5 | Anthropic | 49.5%
15 | Qwen3.7 Max | Alibaba | 48.8%
16 | Gemini 3.5 Flash | Google | 48.8%
17 | Grok 4.3 | xAI | 47.3%
18 | MiniMax-M2.7 | MiniMax | 47.0%
19 | Qwen3.6 Max Preview | Alibaba | 46.9%
20 | GPT-5.4 nano | OpenAI | 46.9%
21 | Grok 4.20 0309 v2 | xAI | 45.6%
22 | Qwen3.7 Plus | Alibaba | 45.5%
23 | MiniMax-M3 | MiniMax | 45.4%
24 | DeepSeek V4 Flash | DeepSeek | 44.9%
25 | Grok 4.20 0309 | xAI | 44.7%
26 | Claude Sonnet 4.6 | Anthropic | 44.1%
27 | GLM-5.1 | Z AI | 43.8%
28 | GLM-5-Turbo | Z AI | 43.6%
29 | GLM 5V Turbo | Z AI | 43.5%
30 | Gemma 4 31B | Google | 43.4%
31 | GPT-5.1 | OpenAI | 43.3%
32 | MiMo-V2.5 | Xiaomi | 43.1%
33 | MiMo-V2-Pro | Xiaomi | 42.5%
34 | Ring-2.6-1T | InclusionAI | 42.4%
35 | Qwen3.5 122B A10B | Alibaba | 42.0%
36 | Gemini 3.1 Flash-Lite Preview | Google | 41.9%
37 | Hy3-preview | Tencent | 41.2%
38 | Qwen3.6 Plus | Alibaba | 40.7%
39 | Qwen3.5 Omni Plus | Alibaba | 40.5%
40 | Step 3.7 Flash | StepFun | 40.0%
41 | Gemma 4 26B A4B | Google | 40.0%
42 | Nemotron 3 Ultra 550B A55B | NVIDIA | 39.9%
43 | Qwen3.6 27B | Alibaba | 39.8%
44 | Mistral Medium 3.5 | Mistral | 39.6%
45 | MiMo-V2-Omni-0327 | Xiaomi | 39.5%
46 | Qwen3.5 27B | Alibaba | 39.5%
47 | Gemini 2.5 Flash | Google | 39.4%
48 | DeepSeek V3.2 | DeepSeek | 38.9%
49 | GPT-5 | OpenAI | 38.8%
50 | Step 3.5 Flash 2603 | StepFun | 38.5%
51 | North Mini Code | Cohere | 38.2%
52 | Gemma 4 12B | Google | 38.2%
53 | Mistral Small 4 | Mistral | 38.0%
54 | Qwen3.5 35B A3B | Alibaba | 37.7%
55 | Ling-2.6-1T | InclusionAI | 37.0%
56 | MiMo-V2-Omni | Xiaomi | 36.7%
57 | Mistral Large 3 | Mistral | 36.2%
58 | Trinity Large Thinking | Arcee AI | 36.1%
59 | NVIDIA Nemotron 3 Super 120B A12B | NVIDIA | 36.0%
60 | Qwen3.6 35B A3B | Alibaba | 35.8%
61 | DeepSeek-V3 | DeepSeek | 35.4%
62 | Nemotron Cascade 2 30B A3B | NVIDIA | 34.8%
63 | Gemini 2.0 Flash | Google | 34.0%
64 | HyperNova 60B 2605 | Multiverse Computing | 33.0%
65 | GPT-4 Turbo | OpenAI | 31.9%
66 | Claude 3.5 Sonnet | Anthropic | 31.6%
67 | JT-35B-Flash | China Mobile | 29.1%
68 | Grok-2 | xAI | 28.5%
69 | EXAONE 4.5 33B | LG AI Research | 28.0%
70 | Nemotron 3 Nano Omni 30B A3B Reasoning | NVIDIA | 27.8%
71 | Qwen3.5 9B | Alibaba | 27.5%
72 | Gemini 1.5 Pro | Google | 27.4%
73 | JT-MINI | China Mobile | 27.2%
74 | Ling 2.6 Flash | InclusionAI | 27.1%
75 | Sarvam 105B | Sarvam | 26.4%
76 | Granite 4.1 30B | IBM | 25.8%
77 | Qwen3.5 Omni Flash | Alibaba | 25.5%
78 | Solar Pro 3 | Upstage | 24.7%
79 | Gemma 4 E4B | Google | 24.4%
80 | Claude 3 Opus | Anthropic | 23.3%
81 | Claude 3 Sonnet | Anthropic | 22.9%
82 | Granite 4.1 8B | IBM | 21.8%
83 | Gemma 4 E2B | Google | 20.9%
84 | Mistral Large | Mistral | 20.8%
85 | Sarvam 30B | Sarvam | 19.2%
86 | Claude 3 Haiku | Anthropic | 18.6%
87 | Claude 2.1 | Anthropic | 18.4%
88 | NVIDIA Nemotron 3 Nano 4B | NVIDIA | 16.4%
89 | Qwen3.5 4B | Alibaba | 16.1%
90 | Granite 4.1 3B | IBM | 11.9%
91 | LFM2 24B A2B | Liquid AI | 10.9%
92 | LFM2.5-8B-A1B | Liquid AI | 7.8%
93 | MiniCPM5-1B | OpenBMB | 4.4%
94 | Qwen3.5 2B | Alibaba | 2.8%
95 | MiniCPM-V 4.6 1.3B | OpenBMB | 2.1%
96 | Qwen3.5 0.8B | Alibaba | 0.0%