Here's a concise recap of the Grok 4 announcement, covering just the highlights.


What’s New in Grok 4

  1. Two Fresh Models

    • Grok 4 (standard)
    • Grok 4 Heavy (the more powerful tier)
      Both are reasoning-only; the older non-reasoning variants are gone.
  2. Record‑Shattering Benchmarks

    • Humanity's Last Exam (expert-level questions across many fields):

      • Grok 4 Heavy with tools: 44%
      • o3 with tools: 24%
    • ARC-AGI-2 (abstract reasoning puzzles): Claude Opus 4's score is roughly half of Grok 4's
    • AIME (U.S. math-olympiad qualifying exam): 100%

  3. Massive Context Window

    • 256 000 tokens (versus 200 k for o3 and Sonnet 4)
    • Still smaller than the 1 000 000-token windows of GPT-4.1 and Gemini
  4. Better‑Than‑Ever Voice Mode

    • Noticeably lower latency than ChatGPT's Advanced Voice Mode
  5. New Subscription Tier

    • $300/mo standalone plan in the Grok app
  6. API Upgrades

    • Built‑in Search Tool

    • Available now to all API users

    • Pricing

      Tier         Input (per 1 M tokens)   Output (per 1 M tokens)   Notes
      Grok 4       $3                       $15                       Matches Sonnet 4; higher than o3 & GPT-4.1
      ≥128 k ctx   price doubles            price doubles             Applies to both input and output rates
  7. What’s Coming Soon?

    • Dedicated Coding Model
    • True Multimodal (text + image)
    • Video Generation

Why It Matters

  • Reasoning-only focus means every request gets the model's full reasoning pass; there is no lighter non-reasoning variant to fall back to.
  • Benchmark dominance raises the bar for everyone, especially on hard, expert-level exams.
  • Huge context window lets you feed in entire books, long legal contracts, or data logs without chopping.
  • API search tool slashes integration work: no more wiring up external search for up‑to‑date info.
  • Upcoming models hint at a full-stack AI suite—coding, images, even video.

FAQ

Q: How does the 256 k context compare to GPT 4.1?
A: GPT-4.1 and Gemini top out at around 1 M tokens, roughly four times Grok 4's window. But Grok 4's 256 k still beats o3 and Sonnet 4 (both at 200 k).
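
To make that comparison concrete, here is a rough Python sketch that checks which of these windows a given document would fit into. The window sizes are the ones quoted in this recap. The 4-characters-per-token ratio, the 8 000-token output reserve, and the helper names `estimate_tokens` and `models_that_fit` are assumptions for illustration only; a real tokenizer will give different counts.

```python
import math

# Context-window sizes in tokens, as quoted in this recap.
CONTEXT_WINDOWS = {
    "Grok 4": 256_000,
    "o3": 200_000,
    "Sonnet 4": 200_000,
    "GPT-4.1": 1_000_000,
    "Gemini": 1_000_000,
}

# Assumption: ~4 characters per token, a common rule of thumb for English text.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Roughly estimate the token count of `text` (hypothetical helper)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def models_that_fit(text: str, reserve_for_output: int = 8_000) -> list[str]:
    """Return the models whose window holds the prompt plus an output reserve."""
    needed = estimate_tokens(text) + reserve_for_output
    return [name for name, window in CONTEXT_WINDOWS.items() if needed <= window]


document = "x" * 900_000  # stand-in for a document of about 225 k tokens
print(models_that_fit(document))  # ['Grok 4', 'GPT-4.1', 'Gemini']
```

Under these assumptions, a document of about 225 k tokens is too large for the 200 k windows but still fits Grok 4 with room left for the reply.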

Q: What does “reasoning‑only” really mean?
A: All weaker “fast” or “light” variants are retired. Every Grok 4 call runs the full chain‑of‑thought engine—no shortcuts.

Q: Is the new Voice Mode free?
A: It's included in all Grok app tiers; latency is much lower than ChatGPT's Advanced Voice Mode.

Q: When will the coding and multimodal models drop?
A: The team hinted “in the coming months”—likely a staged rollout through the rest of 2025.

Q: How do I avoid the 2×‑price jump after 128 k?
A: Keep your prompts under 128 k tokens, or budget accordingly—beyond that, both input and output rates double.
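
For budgeting, the pricing above can be turned into a small estimator. This is a minimal sketch, not xAI's billing logic: it assumes the doubled rate applies to the whole request once the prompt reaches 128 k tokens, which the announcement recap does not spell out. The function name `estimate_cost` is hypothetical.

```python
# Rates from the pricing table in this recap, in USD per 1 M tokens.
INPUT_RATE = 3.00
OUTPUT_RATE = 15.00
LONG_CONTEXT_THRESHOLD = 128_000  # prompt size, in tokens, at which rates double
LONG_CONTEXT_MULTIPLIER = 2


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request.

    Assumption (not confirmed by the announcement): once the prompt reaches
    the threshold, the doubled rate applies to all input and output tokens.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    multiplier = (
        LONG_CONTEXT_MULTIPLIER if input_tokens >= LONG_CONTEXT_THRESHOLD else 1
    )
    base = input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE
    return multiplier * base / 1_000_000


print(f"${estimate_cost(100_000, 2_000):.2f}")  # $0.33, below the threshold
print(f"${estimate_cost(200_000, 2_000):.2f}")  # $1.26, rates doubled
```

Doubling the prompt from 100 k to 200 k tokens nearly quadruples the cost in this model, which is why trimming or splitting long inputs to stay under 128 k can pay off.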