Here’s a concise, conversational recap of the Grok 4 announcement—no rambling, just the highlights you need.
What’s New in Grok 4
Two Fresh Models
- Grok 4 (standard)
- Grok 4 Heavy (punishingly powerful)
- Both are reasoning-only; the older non-reasoning variants are gone.
Record-Shattering Benchmarks
- ARC-AGI-2 (a PhD-level exam that humans can’t pass):
  - Grok 4 with tools: 44%
  - o3 with tools: 24%
  - Claude Opus 4: roughly half of Grok 4’s score
- AIME (American Invitational Mathematics Examination, a math-olympiad qualifier): 100%
Massive Context Window
- 256,000 tokens (vs. 200k for o3 and Claude Sonnet 4)
- Still smaller than the 1,000,000-token windows of GPT-4.1 and Gemini
Better-Than-Ever Voice Mode
- Latency markedly improved over ChatGPT’s Advanced Voice Mode
New Subscription Tier
- $300/month standalone plan in the Grok app
API Upgrades
- Built-in search tool, available now to all API users (a quick usage sketch follows)
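To make the search tool concrete, here’s a minimal sketch of calling it from Python. It assumes the xAI API is OpenAI-compatible (base URL https://api.x.ai/v1, model id grok-4) and that search is toggled through a search_parameters request field; the exact names may differ, so check the current xAI docs before copying this.

```python
# Minimal sketch: Grok 4 chat call with the built-in search tool enabled.
# Assumptions (verify against the xAI docs): OpenAI-compatible endpoint,
# model id "grok-4", and a "search_parameters" request field for live search.
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["XAI_API_KEY"],   # xAI key, not an OpenAI key
    base_url="https://api.x.ai/v1",      # xAI's OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="grok-4",
    messages=[{"role": "user", "content": "Summarize today's AI news."}],
    # extra_body passes provider-specific fields straight through to the API;
    # "auto" here means the model decides when to search (assumed behavior).
    extra_body={"search_parameters": {"mode": "auto"}},
)

print(response.choices[0].message.content)
```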
Pricing

| Model / tier  | Input          | Output          | Notes                                      |
|---------------|----------------|-----------------|--------------------------------------------|
| Grok 4        | $3 / 1M tokens | $15 / 1M tokens | Matches Sonnet 4; higher than o3 & GPT-4.1 |
| ≥128k context | price doubles  | price doubles   | Applies to both input and output rates     |
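As a quick sanity check on that table, here’s a small cost estimator. It assumes the listed rates and that the doubling kicks in once a request’s total token count crosses 128k; whether the threshold counts input only or input plus output isn’t spelled out above, so treat that detail as an assumption.

```python
# Rough Grok 4 cost estimator based on the pricing table above.
# Assumption: both rates double when a request exceeds 128k total tokens.
INPUT_RATE = 3.00    # USD per 1M input tokens
OUTPUT_RATE = 15.00  # USD per 1M output tokens

def grok4_cost(input_tokens: int, output_tokens: int) -> float:
    multiplier = 2 if input_tokens + output_tokens > 128_000 else 1
    return multiplier * (
        input_tokens / 1_000_000 * INPUT_RATE
        + output_tokens / 1_000_000 * OUTPUT_RATE
    )

print(f"Short request:        ${grok4_cost(20_000, 2_000):.2f}")   # $0.09
print(f"Long-context request: ${grok4_cost(150_000, 4_000):.2f}")  # $1.02 (doubled)
```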
What’s Coming Soon?
- Dedicated Coding Model
- True Multimodal (text + image)
- Video Generation
Why It Matters
- Reasoning-only focus means more reliable chains of thought and fewer “I’m not sure” detours.
- Benchmark dominance pushes the bar for everyone, especially in hard, human-level exams.
- Huge context window lets you feed in entire books, long legal contracts, or data logs without chopping.
- API search tool slashes integration work: no more wiring up external search for up-to-date info.
- Upcoming models hint at a full-stack AI suite: coding, images, even video.
FAQ
Q: How does the 256k context compare to GPT-4.1?
A: GPT-4.1 and Gemini top out at around 1M tokens, roughly four times Grok 4’s window. But Grok 4’s 256k still beats most models (o3 and Sonnet 4 sit at 200k).
Q: What does “reasoning‑only” really mean?
A: All weaker “fast” or “light” variants are retired. Every Grok 4 call runs the full chain‑of‑thought engine—no shortcuts.
Q: Is the new Voice Mode free?
A: It’s included in all Grok app tiers; latency is much lower than the old ChatGPT voice.
Q: When will the coding and multimodal models drop?
A: The team hinted “in the coming months”—likely a staged rollout through the rest of 2025.
Q: How do I avoid the 2× price jump after 128k?
A: Keep your prompts under 128k tokens, or budget accordingly; beyond that, both input and output rates double. A rough token check is sketched below.
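If you want a quick pre-flight check on whether a prompt is likely to cross that threshold, here’s a rough sketch. It uses the common ~4 characters-per-token heuristic rather than Grok’s actual tokenizer, so treat the result as an estimate, not a guarantee; the filename is just an example.

```python
# Rough pre-flight check: will this prompt stay under the 128k-token tier?
# Heuristic only: assumes ~4 characters per token, which varies by content.
CHARS_PER_TOKEN = 4
PRICE_TIER_LIMIT = 128_000   # tokens before the 2x pricing kicks in
CONTEXT_LIMIT = 256_000      # Grok 4's full context window

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // CHARS_PER_TOKEN)

def check_prompt(text: str) -> None:
    tokens = estimate_tokens(text)
    if tokens > CONTEXT_LIMIT:
        print(f"~{tokens:,} tokens: won't fit in the 256k window; split the input.")
    elif tokens > PRICE_TIER_LIMIT:
        print(f"~{tokens:,} tokens: fits, but expect the doubled long-context rate.")
    else:
        print(f"~{tokens:,} tokens: within the standard pricing tier.")

# Example: check a long legal contract before sending it (hypothetical file).
check_prompt(open("contract.txt").read())
```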