For six weeks starting in early March 2026, developers using Claude Code began noticing that something was off. Responses felt less focused. Long sessions appeared to lose context the model had already processed earlier in the same task. Output quality on complex coding problems had dropped in ways that were hard to quantify but easy to feel after a few hours of work. On April 23, Anthropic published an engineering postmortem naming what had actually happened: three separate product-layer changes, each made for entirely defensible reasons, had overlapped in ways that nobody anticipated and whose combined effect took weeks to isolate.
Three Changes, One Compounding Effect
The first change arrived on March 4, when Anthropic's team lowered Claude Code's default reasoning effort from high to medium. The motivation was real: during extended thinking periods, the interface appeared frozen, and users were unsure whether Claude was working or had crashed. Reducing thinking time eliminated the confusion. The postmortem acknowledges plainly that it was "the wrong tradeoff." The setting was reversed on April 7, and with it Anthropic bumped the reasoning tier for Opus 4.7 users to what the company calls "xhigh" effort, above the previous high setting.
The second change came on March 26. Anthropic modified how the product managed context in sessions left idle for more than an hour. The intent was to clear older accumulated thinking from those sessions to reduce latency when a user resumed work. A bug caused the clearing to run on every subsequent turn for the rest of the session instead of running once at resumption. The practical effect was a model that appeared to forget earlier context continuously, because it was. The bug was fixed on April 10.
Key Facts
- Quality degradation beganMarch 4, 2026
- Changes identified3 unrelated product-layer modifications
- Duration of impact~6 weeks (March 4 to April 20)
- Coding quality drop (ablation testing)3% across Opus 4.6 and Opus 4.7
- Full resolutionClaude Code v2.1.116, April 20, 2026
- API and model weightsNot affected
The Third Strike
The third change, on April 16, added a system prompt instruction to reduce verbosity in Claude's responses. In isolation it was minor. In combination with the other prompt modifications already in the configuration, it produced a measurable degradation in coding output. Anthropic's own ablation testing, run after the fact, confirmed a three-percent drop in coding quality evaluations across both Opus 4.6 and Opus 4.7. The instruction was reverted on April 20. That same day, Claude Code version 2.1.116 shipped with all three issues resolved.
Anthropic published the postmortem on April 23, three days after the final fix. The timing was deliberate: the team wanted to confirm all three issues were fully clear before the public explanation went out. The document names the specific date of each change, the date of each reversion, and the evaluation data supporting the conclusions. It is notably specific for a quality incident disclosure, including a direct acknowledgment that the reasoning effort downgrade was the wrong call, not a tradeoff that went unexpectedly wrong, but a decision the postmortem retroactively marks as mistaken.
"Changes to Claude's harnesses and operating instructions likely caused degradation. We apologize for the impact this has had on your work." Anthropic Engineering, April 2026 Claude Code Postmortem
What the Postmortem Reveals About AI Product Infrastructure
The episode points to a structural challenge in how modern AI products are deployed and maintained. Model weights go through extensive evaluation before release. The product-layer scaffolding that sits above those weights, the system prompts, caching policies, and interface parameters, can be modified more quickly and with less evaluation rigor. When three such changes overlap across a seven-week window, the compounding effects can be difficult to isolate. Each change made sense when considered on its own. None of the individual changes would necessarily have triggered a visible quality regression in isolation. Together, they did.
All three issues were fully resolved in v2.1.116. Anthropic reset usage limits for all subscribers to compensate for the degraded period. The API and the underlying model weights were never affected, meaning the same model running at its intended configuration produced the expected quality throughout the six-week window. For users who noticed the regression, the practical message is that Claude Code now reflects the quality the product was designed to deliver. For the industry, the postmortem sets a useful precedent for what candid, data-backed disclosure of a product quality incident looks like.