2026-02-19 16:41:14

Google released Gemini 3.1 Pro today. I just saw the benchmark scores, and it looks built to dominate the leaderboards (the model arms race continues, which benefits the semiconductor industry!) 😂

The official positioning is clear: the model is designed for complex tasks such as in-depth research, engineering challenges, long-chain reasoning, and agentic workflows.
Key highlights:
1M token context window (unchanged)
Multimodal support (text + images + video + audio + code)
Output up to 64k tokens
Performance comparison with current mainstream models (Claude Opus 4.6, GPT-5.2/5.3, etc.):
ARC-AGI-2 (the most difficult abstract reasoning benchmark):
Gemini 3.1 Pro scores 77.1%, about 8 percentage points ahead of Claude Opus 4.6 (68.8%) and 20-30+ percentage points ahead of the GPT-5 series. This is the biggest leap, representing a qualitative breakthrough in core reasoning.
GPQA Diamond (PhD-level scientific reasoning): 94.3%, slightly ahead of Claude Opus 4.6 (91.3%) and GPT-5.2 (92.4%), a gap of 2-3 percentage points on a benchmark that is nearing saturation.
SWE-Bench Verified (real software engineering tasks): 80.6%, about 3-5 percentage points ahead of Claude Opus 4.6 (around 76-77%), and ahead of the GPT series by a wider margin (roughly 5-15 percentage points).
Others: top positions on long-horizon agent benchmarks such as Terminal-Bench and APEX-Agents; currently ranked first on LMArena and the Artificial Analysis index, with high cost efficiency.
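As a quick sanity check on the gaps quoted above, here is a minimal sketch that recomputes the percentage-point leads. It uses only the scores stated in this post (not independently verified) and skips any model for which the post gives a range instead of a single number; the `gaps` helper is a made-up name for illustration.

```python
# Benchmark scores (%) exactly as quoted in the post; not independently verified.
scores = {
    "ARC-AGI-2": {"Gemini 3.1 Pro": 77.1, "Claude Opus 4.6": 68.8},
    "GPQA Diamond": {
        "Gemini 3.1 Pro": 94.3,
        "Claude Opus 4.6": 91.3,
        "GPT-5.2": 92.4,
    },
}


def gaps(benchmark, leader="Gemini 3.1 Pro"):
    """Return the leader's percentage-point lead over each other model."""
    row = scores[benchmark]
    return {
        model: round(row[leader] - score, 1)
        for model, score in row.items()
        if model != leader
    }


for name in scores:
    print(name, gaps(name))
```

The differences are in percentage points, not relative percent, which is how the post states them.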
More importantly, the cost advantage is clear:
API pricing (per 1M tokens, based on latest Vertex AI / Gemini API data, standard price for ≤200k context):
Gemini 3.1 Pro: input $2.00, output $12.00 (rising to $4/$18 for >200k context, i.e. 2x on input and 1.5x on output)
Claude Opus 4.6: input $5.00, output $25.00
GPT-5.2 / 5.x: typically $10–15+ for input, $30–75+ for output (higher tiers vary by version)
Advantage margin:
Input: Gemini is 60% cheaper than Claude (2 vs 5), and roughly 80% or more cheaper than the GPT series at the quoted $10–15+.
Output: Gemini is 52% cheaper than Claude (12 vs 25), and roughly 60–80% or more cheaper than GPT at the quoted $30–75+.
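The percentages above come from simple arithmetic on the list prices, which can be sketched as follows. This is a rough illustration only: the prices are the ones quoted in this post, the function names are invented for the example, and the assumption that the >200k tier is triggered by the input (prompt) token count is mine, not something the post states. GPT is left out because the post gives only price ranges for it.

```python
# USD per 1M tokens, as quoted in the post (standard tier, <=200k context).
PRICES = {
    "Gemini 3.1 Pro": {"input": 2.00, "output": 12.00},
    "Claude Opus 4.6": {"input": 5.00, "output": 25.00},
}
# Long-context tier quoted for Gemini (>200k context).
GEMINI_LONG = {"input": 4.00, "output": 18.00}
LONG_CONTEXT_THRESHOLD = 200_000  # tokens


def savings_pct(cheaper, pricier):
    """Percent saved by paying `cheaper` instead of `pricier`."""
    return round((1 - cheaper / pricier) * 100, 1)


def request_cost(model, input_tokens, output_tokens):
    """Estimated USD cost of one request at the quoted list prices.

    Assumption: the long-context tier applies when the input token
    count exceeds the threshold.
    """
    rates = PRICES[model]
    if model == "Gemini 3.1 Pro" and input_tokens > LONG_CONTEXT_THRESHOLD:
        rates = GEMINI_LONG
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


gemini, claude = PRICES["Gemini 3.1 Pro"], PRICES["Claude Opus 4.6"]
print("input savings vs Claude:", savings_pct(gemini["input"], claude["input"]))
print("output savings vs Claude:", savings_pct(gemini["output"], claude["output"]))
print("100k in / 10k out, Gemini:", request_cost("Gemini 3.1 Pro", 100_000, 10_000))
print("100k in / 10k out, Claude:", request_cost("Claude Opus 4.6", 100_000, 10_000))
```

For a hypothetical request with 100k input tokens and 10k output tokens, this works out to about $0.32 on Gemini versus $0.75 on Claude at the quoted prices.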
