Anthropic's Caching Bug Just Made Claude Max Worthless for Power Users

Dan

On March 23, 2026, thousands of Claude Max subscribers started noticing something was very wrong. Their usage meters, which normally lasted the full 5-hour session window, were evaporating in under two hours. Some reported a single prompt jumping their usage from 21% to 100%. Max 20x subscribers paying $200 a month watched their quota hit zero after 90 minutes of normal work.

Anthropic stayed quiet for three days. Then on March 26, they posted a statement on Reddit explaining that session limits had been tightened during peak hours to manage demand. Weekly totals unchanged. Seven percent of users affected. Sorry for the inconvenience.

That explanation is incomplete. And if you pay attention to the timeline, the real story becomes obvious.

This Is Not the First Time

Less than a month before the March 23 incident, Anthropic quietly reset Claude Code rate limits after a prompt caching bug caused usage to drain far faster than expected. That bug inflated token counts by incorrectly accounting for cached context, making the system believe it was doing far more work per request than it actually was.
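To see why that kind of miscount matters, here is a minimal sketch of the arithmetic. Everything in it is an assumption made for illustration: the function name, the 0.1 weight for cached tokens, and the request sizes are hypothetical and are not taken from Anthropic's actual quota accounting.

```python
# Hypothetical illustration only: the weight, names, and request sizes are
# assumptions, not Anthropic's real quota accounting.

CACHE_READ_WEIGHT = 0.1  # assumed discount for tokens served from the prompt cache


def metered_tokens(fresh_tokens: int, cached_tokens: int, cache_weight: float) -> float:
    """Return the token count charged against a quota for one request."""
    return fresh_tokens + cached_tokens * cache_weight


# One request late in a long session: small new input, large cached context.
fresh, cached = 2_000, 150_000

# Intended behavior: cached context is charged at the discounted weight.
correct = metered_tokens(fresh, cached, CACHE_READ_WEIGHT)

# Buggy behavior: cached context is charged as if it were all fresh input.
buggy = metered_tokens(fresh, cached, 1.0)

inflation = buggy / correct
print(f"correct={correct:.0f} buggy={buggy:.0f} inflation={inflation:.1f}x")
```

Under these made-up numbers, the same request is charged roughly nine times what it should be. The longer the session and the larger the cached context, the bigger the gap, which is the kind of behavior heavy agentic users would hit first.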

That bug was fixed. Or so users were told.

Then, on March 23, it happened again. Same symptoms. Same pattern. Usage jumping in sudden lurches rather than draining gradually. Max 5x users out of quota in 90 minutes. Max 20x users going from 21% to 100% on a single prompt. These are not the numbers of a system managing load. These are the numbers of a system miscounting.

A load management decision should produce slower, more gradual depletion: every request costs a bit more, and the meter climbs steadily. A caching bug tends to produce sudden, inexplicable jumps.
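If you log your own meter readings, that distinction is easy to check. The following is a rough sketch, not a diagnostic tool: the function names and the 50-point threshold are arbitrary assumptions, and a large jump can also come from a legitimately expensive request.

```python
def largest_jump(usage_percent: list[float]) -> float:
    """Largest single-step increase in a series of usage-meter readings."""
    steps = (after - before for before, after in zip(usage_percent, usage_percent[1:]))
    return max(steps, default=0.0)


def looks_like_miscount(usage_percent: list[float], jump_threshold: float = 50.0) -> bool:
    """Flag a session whose meter moved by at least `jump_threshold` points in one step.

    The threshold is an arbitrary assumption chosen for illustration.
    """
    return largest_jump(usage_percent) >= jump_threshold


# Readings taken after each prompt in a session (percent of quota used).
gradual = [5, 12, 21, 30, 41, 52]   # steady climb, consistent with tighter limits
lurching = [5, 12, 21, 100]         # the 21% to 100% jump users reported

print(looks_like_miscount(gradual))
print(looks_like_miscount(lurching))
```

A steady series with small steps fits the "limits got tighter" explanation. A single step of 79 points does not.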

What Anthropic Admitted vs What Actually Happened

Anthropic's official statement, issued by Thariq Shihipar on March 26, confirms that peak-hour session limits were deliberately tightened. Weekdays, 5am to 11am PT, your session burns faster. That part is intentional policy.

But that statement does not explain a usage meter jumping from 21% to 100% on a single prompt. That is not load balancing. That is a token accounting error. The two things are happening simultaneously, and Anthropic's statement addressed the policy change while leaving the bug conspicuously unacknowledged.

GitHub issue #38335 documents this precisely: users with the same workloads, same prompts, same Claude Code version, seeing tasks that previously consumed 20-30% of quota suddenly consuming 80-100% in one shot. That behavior started on March 23. It did not start the day Anthropic decided to tighten peak-hour limits.

The Business Context Makes This Worse

The Max plan tiers exist for one reason: to give power users and developers enough headroom that they never have to think about limits. Max 5x is $100 a month. Max 20x is $200 a month. These are not casual subscriptions. These are professional tools for people running agentic workloads, coding agents, and long-form research sessions.

The value proposition is simple: you pay a premium, you get reliable access. A caching bug that silently drains a $200 subscription in 90 minutes breaks that contract completely. And doing it twice in a month, with no explicit acknowledgment of the bug the second time around, compounds the damage.

To be fair, the broader context is real. Anthropic's user base surged massively in March 2026 after OpenAI signed a Pentagon contract that triggered a wave of defections. ChatGPT uninstalls spiked 295% in a single day. Claude hit number one on the US App Store. That kind of growth puts genuine strain on GPU infrastructure, and tightening peak-hour limits is a defensible response to a real supply problem.

But acknowledging the demand issue does not make the caching bug go away. Both things are true at the same time, and conflating them into a single "we're managing demand" narrative leaves paying users without honest information about what actually happened to their quota.

What Should Have Happened

A clear incident report. Something like: "On March 23, we identified a regression in prompt cache accounting that caused token usage to be overcounted for certain workloads. This compounded the peak-hour limit adjustments we were simultaneously rolling out. We've corrected the accounting error. Here's what was affected."

Instead, users got three days of silence followed by a statement that only addressed the intentional policy change, leaving the bug quietly buried in the GitHub issue tracker where most subscribers will never see it.

The good news: Anthropic does fix these bugs when they're surfaced loudly enough. They did it the first time. They'll do it again. But the pattern of silent rollouts, slow acknowledgment, and incomplete explanations is becoming a recurring issue for a company charging professional rates for developer tooling.

If you're a Max subscriber and your usage is still evaporating faster than it should, you are not imagining it. Shift heavy workloads to off-peak hours (evenings, weekends) as a workaround while the fix propagates. And keep an eye on GitHub issue #38335 for updates from people actually tracking the underlying bug.
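If you schedule heavy jobs from a script, a small guard can keep them out of the peak window. This is a minimal sketch that assumes the window is exactly weekdays from 5am up to (but not including) 11am Pacific, as described in Anthropic's statement; the function name and the boundary handling are assumptions.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

PACIFIC = ZoneInfo("America/Los_Angeles")

# Assumed window, based on the statement: weekdays, 5am to 11am PT.
PEAK_START_HOUR = 5
PEAK_END_HOUR = 11  # treated as exclusive here, which is an assumption


def is_peak(moment: datetime) -> bool:
    """Return True if a timezone-aware datetime falls inside the assumed peak window."""
    local = moment.astimezone(PACIFIC)
    is_weekday = local.weekday() < 5  # Monday is 0, Friday is 4
    return is_weekday and PEAK_START_HOUR <= local.hour < PEAK_END_HOUR


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    if is_peak(now):
        print("Peak hours: consider deferring heavy workloads.")
    else:
        print("Off-peak: go ahead.")
```

Passing timezone-aware datetimes matters here, because the window is defined in Pacific time and shifts against UTC when daylight saving time changes.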

For a $200/month product, "it's probably a caching bug, check back later" is not an acceptable answer. Anthropic knows this. Now they need to act like it.
