Claude Opus 4.8 Launches to Favorable Reviews
Claude Opus 4.8 shipped May 28 at the same price, with sharper judgment and honesty. Effort control and dynamic workflows landed to a positive early reception.
Anthropic released Claude Opus 4.8 on May 28, holding its predecessor's pricing flat while pushing only on performance — $5 per million input tokens and $25 per million output, with the model callable as claude-opus-4-8 through the Claude API.
Anthropic called the update a modest but tangible improvement, but early developer reactions ran hotter. GitHub Copilot, Windsurf, and Devin integrated it on launch day, and two new features — effort control and dynamic workflows — shipped alongside as praise spread for the free performance boost.
Price Freeze and Enhanced Reliability
The lead improvement is what Anthropic calls honesty. Opus 4.8 is about four times less likely than its predecessor to let a flaw in its own code slip by, and it flags uncertain parts upfront instead of declaring a task done on thin evidence.
Its judgment improved too. Early testers report that in Claude Code, Opus 4.8 asks clarifying questions, fixes its own errors, and pushes back on shaky plans. On computer use it scored 84% on the Online-Mind2Web benchmark, ahead of both 4.7 and GPT-5.5.
Partner scorecards agree. On Harvey's Legal Agent Benchmark, Opus 4.8 set a record 10.4% and became the first model past the 10% all-pass mark. Cognition, which builds the coding agent Devin, said the verbose-comment and tool-calling issues from 4.7 are fixed.
Effort Controls and Dynamic Workflows
The features shipped alongside Opus 4.8 matter as much as the model itself. The first, effort control, arrived in Claude.ai and Cowork: next to the model picker, users set how hard Claude works on a response. Opus 4.8 defaults to 'high,' scaling up to 'extra' ('xhigh' in Claude Code) or 'max' for tougher tasks, on every plan. Lower effort control settings answer faster and spend rate limits more slowly, and Anthropic recommends the 'extra' effort control level for hard, long-running jobs.
The second, dynamic workflows, launched in research preview. Claude Code plans the work, spawns hundreds of parallel subagents in one session, verifies the output, and reports back — enough, Anthropic says, to run a codebase-scale migration across hundreds of thousands of lines from kickoff to merge against the existing test suite. These dynamic workflows roll out first on Enterprise, Team, and Max plans, extending the Claude Code agent-teams paradigm; paired with Opus 4.8, the dynamic workflows run longer before reporting back.
A developer-side change rounds it out: the Messages API now takes system entries inside the messages array, so instructions update mid-task without breaking the prompt cache. Fast mode runs the same model 2.5x faster at a third of the earlier cost.
Immediate Real-World Adoption and Praise
On the feature sheet it reads incremental, but the people using it moved first. GitHub Copilot made Opus 4.8 generally available on launch day, praising a clear step forward in code comprehension; Windsurf, the Devin CLI, and Genspark added same-day support. Japanese developers called it the most capable model available now, and Chinese-language posts framed it as more capability at the same price. Elon Musk added a 'nice work.'
Not all of it was praise. Some flagged Anthropic for highlighting only its own scores in pink on the benchmark charts, and the rate-limit gripe — a single hello burning the monthly cap — returned, along with the long-running debate over degradation in Claude's output. None of it overturned the day-one mood.
Anthropic is already lining up what's next: a cheaper model with Opus-class capability, and a tier above Opus in raw intelligence. Under Project Glasswing, a few organizations are testing the Mythos Preview model for cybersecurity, and Anthropic says Mythos-class models reach everyone within weeks once stronger safeguards are ready. If 4.8 is the modest step, the real leap is staged over there.