Claude Opus 4.7 Update: 3x Better Vision, 3x More Production Tasks Completed

On April 16, 2024, Anthropic quietly upgraded Claude Opus from version 4.6 to 4.7. Officially, it’s a “generally available minor iteration.” But if you dig into the release notes, this update brings substantial changes — especially for developers and users who rely on complex visual information.


Vision: From “Can’t Process” to “Just Upload”

The most obvious improvement in Claude Opus 4.7 is visual processing. Previous versions capped images at roughly 800 pixels on the long side. If you tried to upload a detailed chart, technical schematic, or product UI screenshot, you often had to compress or crop it first. Version 4.7 raises that limit to 2,576 pixels on the long side (about 3.75 megapixels), more than three times the previous long-side cap.

What does that mean in practice? Take a chemical structure diagram with dense atomic markers and bond lines. At low resolution, the model might miss a substituent's position. Or a multi-layer PCB schematic with tiny annotations: before, sending that to the model was nearly useless. Now you can upload the original file. Anthropic didn't publish accuracy gains for these specific tasks, but more than tripling the long-side resolution means the model sees far more detail.

In short: you no longer have to degrade image quality to fit the model. Snap a high‑res product screenshot or scan a technical manual page, hand it to Claude Opus 4.7, and it will see roughly what you see.
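The new ceiling is easy to check client-side before uploading. A minimal sketch using only the long-side figures quoted above (the function names are this sketch's own, not part of any SDK):

```python
def fits_limit(width: int, height: int, long_side_limit: int = 2576) -> bool:
    """True if the image can be sent without downscaling (4.7's long-side cap)."""
    return max(width, height) <= long_side_limit

def downscale_factor(width: int, height: int, long_side_limit: int = 2576) -> float:
    """Scale factor that brings the long side within the limit (1.0 = no resize)."""
    return min(1.0, long_side_limit / max(width, height))

# A 1920x1080 screenshot: over the old ~800px cap, comfortably under 4.7's.
print(fits_limit(1920, 1080, long_side_limit=800))  # False
print(fits_limit(1920, 1080))                       # True
```

The same check tells you how hard to downscale when an image still exceeds the cap, rather than guessing a compression level.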

Coding: 3x More Real‑World Tasks Completed

The coding improvements are even more interesting. Anthropic ran benchmarks on 93 coding tasks: Claude Opus 4.7 solved 13% more than 4.6. A 13% gain is a normal upgrade, nothing shocking. But another number stands out: on "real-world production tasks," version 4.7 completed "three times as much work" as 4.6.

The key phrase is “real‑world production tasks.” Benchmarks are standardized, finite problems. Production work includes edge cases, legacy code, inconsistent naming across files, poorly handled third‑party dependencies, and all the messiness of actual software. A minor version bump that triples throughput in that environment suggests a qualitative change in how the model handles complex, long‑context, non‑ideal conditions.

Self‑Verification: No More Wandering Off

The improvement in long‑task consistency explains much of that production gain. Earlier versions, when asked to build a project spanning multiple files — say, a frontend component with an API backend plus a database migration — could start drifting halfway through. Or they would reinvent the wheel, redefining a helper function that already existed earlier in the conversation.

Claude Opus 4.7 adds a "self-verification step" during generation. After writing a chunk of code, it checks whether that chunk is consistent with previous logic and whether it follows the original user request. Then it decides whether to revise. This isn't flashy, but for long tasks it's a big deal. Think of it as an internal reviewer that runs while the model writes, not after.

Instruction Following: More Literal, Old Prompts May Break

There’s one easily overlooked change with wide impact: Claude Opus 4.7 follows instructions more literally. You say exactly what you want, and it does exactly that — without adding extra interpretation.

For example, earlier you might write "summarize the main points of this article." The model might impose its own formatting, such as three paragraphs of under 50 words each, because it learned from training data that summaries often follow those conventions. Now, if you only say "summarize the main points," you might get a simple bullet list or a few sentences: no automatic polishing, deduplication, or categorization.

This change is a hidden trap for existing users. If your current prompts rely on the model "filling in the blanks" (you give a rough direction and expect it to add details), switching to 4.7 may produce outputs that feel too bare or miss your intent. Anthropic explicitly mentions this in the release notes: "users are advised to rework their existing prompts." It's rare for a model provider to highlight a breaking change like this; they usually emphasize backward compatibility. The fact that they call it out means the shift is significant.

How should you adjust your prompts?

If you used short instructions like “analyze the performance issues in this code,” change to something more specific: “Go through each line of the following code, identify loops with time complexity worse than O(n), suggest an optimization for each issue, and output the results as a numbered list.” Write out every step you want. Don’t leave it to guesswork.

Price Unchanged, But Your Bill May Rise

Claude Opus 4.7 keeps the same API pricing as 4.6: $25 per million output tokens. The model identifier in the API is claude-opus-4-7.

However, two things can increase your actual costs.

First: the tokenizer changed

The tokenizer — the tool that chops text into tokens — is new in 4.7. The same text now produces more tokens. Official range: 1× to 1.35× the previous count. That means a prompt that used 1,000 tokens may now require 1,000–1,350 tokens. Both input and output are affected.

Second: higher effort levels produce longer outputs

Version 4.7 adds an xhigh effort level (more on that below). If you turn it up, the model does deeper reasoning and typically generates longer responses. More output tokens, multiplied by the $25 per million rate, directly increases your cost per call.

Combine the two: tokenizer adds 0–35% more base consumption, and xhigh makes outputs longer. The per‑token price didn’t move, but your token usage likely will. If you run in production, test a small batch of real tasks first, compare average cost per call between 4.6 and 4.7, and don’t rely solely on the published rate card.
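That back-of-envelope comparison is easy to script. A rough sketch using only the figures above (the $25/M output rate and the official 1.00x-1.35x tokenizer range); the output_growth factor for longer xhigh responses is your own measurement to supply, not an official number:

```python
OUTPUT_PRICE_PER_M = 25.00  # dollars per million output tokens, unchanged from 4.6

def cost_range_on_47(output_tokens_46: int, output_growth: float = 1.0) -> tuple[float, float]:
    """Bracket the per-call output cost on 4.7 from a measured 4.6 token count.

    Applies the published tokenizer range (1.00x-1.35x); output_growth models
    any extra length from higher effort levels.
    """
    per_token = OUTPUT_PRICE_PER_M / 1_000_000
    low = output_tokens_46 * 1.00 * output_growth * per_token
    high = output_tokens_46 * 1.35 * output_growth * per_token
    return low, high

# A call that produced 2,000 output tokens on 4.6:
low, high = cost_range_on_47(2_000)
print(f"${low:.4f} - ${high:.4f}")  # $0.0500 - $0.0675
```

Feed it average token counts from your real 4.6 traffic and the spread tells you the worst case before you flip any production switch.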

New Toolchain Options: Four Additions for Developers

Anthropic added several toolchain features in 4.7. If you’re a developer or heavy API user, these four are worth knowing.

1. New xhigh effort level

Previous effort levels: low, medium, high. Version 4.7 adds xhigh above high. Deeper reasoning, higher latency. Use it for complex tasks that require careful verification — math proofs, multi‑step logical reasoning, large code refactors. For simple queries, medium or high is sufficient.

Effort level | Best for                                   | Latency  | Output length
low          | simple Q&A, classification, extraction     | lowest   | short
medium       | regular conversation, summarization        | low      | medium
high         | complex analysis, code generation          | moderate | long
xhigh        | math proofs, deep reasoning, big refactors | highest  | longest
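If you route requests programmatically, the table above maps naturally onto a small lookup. The task-type labels below are this sketch's own taxonomy, not values the API defines:

```python
# Mirrors the effort table above. Unknown task types default to "medium"
# so you never pay xhigh latency by accident.
EFFORT_BY_TASK = {
    "classification": "low",
    "extraction": "low",
    "conversation": "medium",
    "summarization": "medium",
    "analysis": "high",
    "code_generation": "high",
    "math_proof": "xhigh",
    "refactor": "xhigh",
}

def pick_effort(task_type: str) -> str:
    return EFFORT_BY_TASK.get(task_type, "medium")

print(pick_effort("refactor"))  # xhigh
print(pick_effort("weather"))   # medium (unknown type, safe default)
```

The defensive default matters more than the table itself: the expensive setting should be opt-in per task, not the fallback.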

2. Task budget (beta)

Previously, you could only see token spend after a task finished. The task budget feature (public beta) lets you set a maximum token budget before the task starts. Useful for cost-sensitive production environments. For example, set a limit of 5,000 tokens for a task; the model will automatically wrap up as it approaches the cap.
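The server-side beta enforces the cap for you. A client-side analogue of the same idea looks like this; the class and parameter names are illustrative, not the API's:

```python
class TokenBudget:
    """Track cumulative token spend and signal when to wrap up.

    Sketches the task-budget idea client-side: stop requesting more work
    once spend comes within a margin of the cap (here, the last 10%).
    """

    def __init__(self, max_tokens: int, wrap_up_margin: float = 0.1):
        self.max_tokens = max_tokens
        self.spent = 0
        self.wrap_up_at = int(max_tokens * (1 - wrap_up_margin))

    def record(self, tokens_used: int) -> None:
        self.spent += tokens_used

    def should_wrap_up(self) -> bool:
        return self.spent >= self.wrap_up_at

budget = TokenBudget(5_000)
budget.record(4_600)
print(budget.should_wrap_up())  # True: within 10% of the 5,000-token cap
```

Checking before each follow-up call, rather than after, is what keeps a runaway task from blowing past the limit by a whole response.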

3. /ultrareview – dedicated code review mode

This lives inside Claude Code. Earlier, to do a code review you had to write a long prompt: “please check this code for security holes, performance issues, style violations…” Now there’s a dedicated entry point: /ultrareview. It applies an optimized review strategy automatically — no prompt engineering required. The depth and focus are better than using the general‑purpose mode.

4. Auto mode – automatic long‑task decisions

Auto mode is now available to Max users. On long tasks, the model can decide what to do next without asking you to confirm each step. Example: you ask it to “crawl the first three levels of this website and extract the title and meta description from each page.” Earlier, the model might pause after every page and ask “continue?” Now it can keep going until the task finishes or hits an exception that truly needs human input. Less control for those who like manual steering, but much higher throughput for batch jobs.

Safety: Some Ups, Some Downs, Reported Honestly

Overall safety metrics for Claude Opus 4.7 are roughly flat compared to 4.6, but with internal trade‑offs.

Improvements:

  • Honesty: the model is more likely to say “I don’t know” instead of making things up.
  • Prompt injection resistance: malicious instruction injection attacks are less effective.

Flat:

  • Deception rate: no meaningful change, remains low.
  • Sycophancy rate: the model does not show increased tendency to change answers to please the user.

Worse:

  • In a few specific "harm reduction" sub-scenarios, version 4.7 performs slightly worse than 4.6. Anthropic lists those scenarios and the test data in the release notes rather than hiding them. That level of transparency is rare among large vendors, most of whom only talk about improvements or vaguely wave at weaknesses.

Also, Claude Opus 4.7 integrates with Project Glasswing and automatically blocks high‑risk cybersecurity requests, such as generating exploit code or providing step‑by‑step bypass instructions. If you are doing legitimate security research (not malicious attacks), you can request exceptional access through Anthropic’s Cyber Verification Program.

Frequently Asked Questions

What are the main differences between Claude Opus 4.7 and 4.6?

Vision: long-side pixel limit roughly tripled, from ~800 to 2,576. Real-world production tasks completed: tripled. Instruction following: more literal. New xhigh effort level, task budget public beta, /ultrareview code review mode, and Auto mode for Max users. Tokenizer updated, so the same text uses more tokens.

Do I need to adjust my existing prompts?

Yes. Version 4.7 follows instructions more literally and does not “fill in the blanks” the way 4.6 did. If your prompts are short and rely on the model adding implicit details, rewrite them to be more explicit and step‑by‑step. Anthropic explicitly advises users to rework prompts.

Will my API bill go up after upgrading?

The per‑token price is unchanged, but your actual cost may rise for two reasons: the new tokenizer increases token consumption by 0–35%, and using xhigh effort produces longer outputs. Run a test with your real workloads before switching production traffic.

When should I use xhigh effort?

Math proofs, multi‑step logical reasoning, large code refactors, tasks that require deep verification. For simple Q&A or routine conversations, low or medium is fine — xhigh adds unnecessary latency and cost.

What is /ultrareview and how is it different from normal code review?

/ultrareview is a dedicated mode inside Claude Code, optimized specifically for code review. It applies a built‑in strategy for security, performance, and style checks. No prompt needed. It is more focused and thorough than the general‑purpose chat mode.

What is Auto mode and who can use it?

Auto mode lets the model decide the next action automatically during long tasks, without waiting for user confirmation after each step. Available to Max users. Suitable for batch processing, crawling, multi‑step transformations.

How does the tokenizer change affect me?

The same text now produces more tokens, for both input and output. If you have tight token budgets from previous usage, you will need to recalibrate.

Is Claude Opus 4.7 safer than 4.6?

Overall, safety is roughly flat, but with trade‑offs. Honesty and prompt injection resistance improved. A few harm‑reduction sub‑scenarios regressed slightly. Anthropic published both the wins and the regressions.

Summary: Two Dimensions of Upgrades

This release can be split into two directions.

First: performance improvements for end users. Vision triples in resolution. Real-world production task completion triples. Long-task consistency improves noticeably because of the self-verification mechanism. These are differences you can feel without caring about technical details: clearer image handling, more reliable code, fewer long projects that wander off track.

Second: toolchain enrichment for developers. xhigh adds a deeper reasoning option for hard problems. Task budget lets you cap token spend upfront. /ultrareview gives you a dedicated code review entry point. Auto mode makes long tasks more autonomous. These all answer one question: how do you make the model smoother and more controllable to use?

A minor version bump that serves both casual users and developers at the same time is uncommon in today’s model release cadence. If you are currently using Claude Opus 4.6, try 4.7 in a non‑production environment first. Focus on two things: whether your prompts need adjustment, and how your actual costs change. Once you have those answers, decide when to fully switch over.