AI Flash Report

Gemini 3.1 Pro vs Kimi K2: Benchmarks, Pricing & Capabilities Compared

TL;DR: Gemini 3.1 Pro wins on coding and reasoning; Kimi K2 wins on cost.

Gemini 3.1 Pro (Google)
Released: 2026-02-19
Context window: 2M tokens
Input price: $2.50 / Mtok
Output price: $10.00 / Mtok
Key features
  • 2x reasoning score on ARC-AGI-2 vs Gemini 3 Pro
  • ARC-AGI-2 score of 77.1%
  • Enhanced multimodal understanding

Kimi K2 (Moonshot AI)
Released: 2026-01-20
Context window: 2M tokens
Input price: $0.15 / Mtok
Output price: $2.50 / Mtok
Key features
  • First open-weight model to rank #1 on LMSYS Chatbot Arena
  • 1.04 trillion parameters
  • K2.5 agent swarms with up to 100 sub-agents

Benchmark comparison

Benchmark            Gemini 3.1 Pro   Kimi K2
GPQA Diamond         84.2%            74.1%
LiveCodeBench        78.9%            68.9%
SWE-bench Verified   72.3%            65.8%

Pricing comparison

Metric                                         Gemini 3.1 Pro   Kimi K2
Input ($/Mtok)                                 $2.50            $0.15
Output ($/Mtok)                                $10.00           $2.50
Cached input ($/Mtok)                          $0.25            -
Cost per 1M-token roundtrip (1M in + 1M out)   $12.50           $2.65
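
The roundtrip row is the input and output list prices applied to one million tokens each way. A minimal sketch of that arithmetic, using the prices from the pricing table; the `PRICES` dictionary and the `roundtrip_cost` function are illustrative names, not part of any vendor SDK, and the cached-input discount is deliberately left out:

```python
from decimal import Decimal

# List prices in USD per million tokens, copied from the pricing table.
PRICES = {
    "Gemini 3.1 Pro": {"input": Decimal("2.50"), "output": Decimal("10.00")},
    "Kimi K2": {"input": Decimal("0.15"), "output": Decimal("2.50")},
}


def roundtrip_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of one request at list price (no caching discount)."""
    price = PRICES[model]
    per_million = Decimal(1_000_000)
    total = price["input"] * input_tokens + price["output"] * output_tokens
    return total / per_million


if __name__ == "__main__":
    for model in PRICES:
        cost = roundtrip_cost(model, 1_000_000, 1_000_000)
        print(f"{model}: ${cost:.2f}")
```

Passing different token counts gives the cost for another input/output mix, which matters because output tokens are priced well above input tokens for both models.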

Context window & modalities

Attribute           Gemini 3.1 Pro                   Kimi K2
Context window      2M tokens                        2M tokens
Input modalities    text, image, audio, video, PDF   text, image
Output modalities   text                             text
Knowledge cutoff    2025-12                          2025-10

Verdict by use case

Coding
→ Gemini 3.1 Pro
Basis: SWE-bench Verified

Gemini 3.1 Pro 72.3% vs Kimi K2 65.8% on SWE-bench Verified.

Reasoning
→ Gemini 3.1 Pro
Basis: GPQA Diamond

Gemini 3.1 Pro 84.2% vs Kimi K2 74.1% on GPQA Diamond.

Math
Insufficient data
Basis: MATH / AIME

No shared math benchmark.

Long context
Tie
Basis: Context window

Gemini 3.1 Pro 2M tokens vs Kimi K2 2M tokens.

Cost
→ Kimi K2
Basis: Input $/Mtok

Gemini 3.1 Pro $2.50/Mtok vs Kimi K2 $0.15/Mtok input.
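
The verdicts above appear to follow a simple rule: choose one basis metric per use case, compare the two values, and report "Insufficient data" when a shared metric is missing. The sketch below is one way to express that rule. It is an assumption about how such verdicts could be derived, not the report's documented method, and `decide` and `USE_CASES` are hypothetical names:

```python
from typing import Optional


def decide(name_a: str, value_a: Optional[float],
           name_b: str, value_b: Optional[float],
           higher_is_better: bool = True) -> str:
    """Return the winner's name, 'Tie', or 'Insufficient data'."""
    if value_a is None or value_b is None:
        return "Insufficient data"
    if value_a == value_b:
        return "Tie"
    a_wins = value_a > value_b if higher_is_better else value_a < value_b
    return name_a if a_wins else name_b


# Basis metrics copied from the tables in this report.
# Tuple layout: (Gemini 3.1 Pro value, Kimi K2 value, higher_is_better)
USE_CASES = {
    "Coding": (72.3, 65.8, True),                  # SWE-bench Verified, %
    "Reasoning": (84.2, 74.1, True),               # GPQA Diamond, %
    "Math": (None, None, True),                    # no shared math benchmark
    "Long context": (2_000_000, 2_000_000, True),  # context window, tokens
    "Cost": (2.50, 0.15, False),                   # input price, $/Mtok
}

if __name__ == "__main__":
    for use_case, (gemini, kimi, higher) in USE_CASES.items():
        winner = decide("Gemini 3.1 Pro", gemini, "Kimi K2", kimi, higher)
        print(f"{use_case}: {winner}")
```

A single-metric rule like this is easy to audit but coarse: the Cost verdict, for instance, looks only at input price and ignores output price.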

Changelog & releases

Gemini 3.1 Pro
Released 2026-02-19
Predecessor: google-gemini-3-pro
  • 2x reasoning score on ARC-AGI-2 vs Gemini 3 Pro
  • Context window expanded to 2M tokens
  • Deep Think mode enabled by default on the Pro tier
  • Lower time-to-first-token latency despite the larger context

Kimi K2
Released 2026-01-20
  • 2M-token context window (20x the first Kimi model)
  • Agentic tool-use tuning via MuonClip optimizer
  • Open weights under modified MIT
