North Mini Code vs Step 3.7 Flash: Benchmarks, Pricing & Capabilities Compared
TL;DR: North Mini Code wins on cost (free input and output) · Step 3.7 Flash wins for reasoning and leads on every shared benchmark.
North Mini Code (Cohere)
- Released: 2026-06-09
- Context window: 256K tokens
- Input price: $0.00 / Mtok
- Output price: $0.00 / Mtok
Step 3.7 Flash (StepFun)
- Released: 2026-05-29
- Context window: 256K tokens
- Input price: $0.20 / Mtok
- Output price: $1.15 / Mtok
Benchmark comparison
| Benchmark | North Mini Code | Step 3.7 Flash |
|---|---|---|
| GPQA Diamond | 75.7% | 80.9% ✓ |
| HLE | 9.9% | 19.9% ✓ |
| IF-Bench | 57.6% | 67.3% ✓ |
| LiveCodeBench Reasoning | 32.3% | 63.7% ✓ |
| SciCode | 38.2% | 40.0% ✓ |
| TAU2-bench | 37.4% | 98.5% ✓ |
| TerminalBench-Hard | 31.1% | 35.6% ✓ |
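To make the size of each gap easier to see, the table can be summarized in percentage points. The following is a minimal sketch that uses only the scores listed above; the names `SCORES` and `step_margin` are hypothetical helpers introduced for illustration, not part of any vendor tooling.

```python
# Scores in percent, copied from the benchmark table above:
# benchmark -> (North Mini Code, Step 3.7 Flash)
SCORES = {
    "GPQA Diamond": (75.7, 80.9),
    "HLE": (9.9, 19.9),
    "IF-Bench": (57.6, 67.3),
    "LiveCodeBench Reasoning": (32.3, 63.7),
    "SciCode": (38.2, 40.0),
    "TAU2-bench": (37.4, 98.5),
    "TerminalBench-Hard": (31.1, 35.6),
}


def step_margin(benchmark: str) -> float:
    """Step 3.7 Flash score minus North Mini Code score, in percentage points."""
    north, step = SCORES[benchmark]
    return round(step - north, 1)


# Hypothetical summary: count the benchmarks Step 3.7 Flash leads and find the extremes.
margins = {name: step_margin(name) for name in SCORES}
step_wins = sum(1 for margin in margins.values() if margin > 0)
widest = max(margins, key=margins.get)
narrowest = min(margins, key=margins.get)

print(f"Step 3.7 Flash leads on {step_wins} of {len(SCORES)} benchmarks")
print(f"Widest gap: {widest} ({margins[widest]} points)")
print(f"Narrowest gap: {narrowest} ({margins[narrowest]} points)")
```

A raw point difference treats every benchmark as equally scaled, which is a simplification: a 10-point gap on HLE doubles the lower score, while a 5-point gap on GPQA Diamond is a much smaller relative change.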
Pricing comparison
| Metric | North Mini Code | Step 3.7 Flash |
|---|---|---|
| Input ($/Mtok) | $0.00 | $0.20 |
| Output ($/Mtok) | $0.00 | $1.15 |
| Cached input ($/Mtok) | — | — |
| Cost per 1M-token roundtrip (1M in + 1M out) | $0.00 | $1.35 |
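The roundtrip row is the sum of the input and output prices for 1M tokens each. For other traffic mixes, the same arithmetic can be sketched as below. This assumes the list prices in the table and ignores cached-input discounts, which the table does not report; `PRICES` and `request_cost` are hypothetical names used only for illustration.

```python
# List prices in USD per million tokens, copied from the pricing table above.
PRICES = {
    "North Mini Code": {"input": 0.00, "output": 0.00},
    "Step 3.7 Flash": {"input": 0.20, "output": 1.15},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a request at list prices (no caching discount assumed)."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Reproduce the roundtrip row: 1M input tokens plus 1M output tokens.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 1_000_000, 1_000_000):.2f}")
```

Because output tokens cost more than five times as much as input tokens on Step 3.7 Flash, the output-heavy share of a workload dominates its bill.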
Context window & modalities
| Attribute | North Mini Code | Step 3.7 Flash |
|---|---|---|
| Context window | 256K tokens | 256K tokens |
| Input modalities | text | text, image |
| Output modalities | text | text |
| Knowledge cutoff | — | — |
Verdict by use case
Coding
Insufficient data
Basis: SWE-bench
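No shared SWE-bench score. On the coding-related benchmarks both models report (LiveCodeBench Reasoning, SciCode, TerminalBench-Hard), Step 3.7 Flash leads.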
Reasoning
→ Step 3.7 Flash
Basis: GPQA Diamond
North Mini Code 75.7% vs Step 3.7 Flash 80.9% on GPQA Diamond.
Math
Insufficient data
Basis: MATH / AIME
No shared math benchmark.
Long context
Tie
Basis: Context window
North Mini Code 256K tokens vs Step 3.7 Flash 256K tokens.
Cost
→ North Mini Code
Basis: Input $/Mtok
North Mini Code $0.00/Mtok vs Step 3.7 Flash $0.20/Mtok input.
Changelog & releases
North Mini Code
Released 2026-06-09
Step 3.7 Flash
Released 2026-05-29