Anthropic Off Peak: Caching Bug Just Made Claude Max Worthless for Power Users

Dan
Anthropic Off Peak: Caching Bug Just Made Claude Max Worthless for Power Users
On March 23, 2026, thousands of Claude Max subscribers started noticing something was very wrong. Their usage meters, which normally lasted the full five hour session limits, were evaporating in under two hours. Many Claude users hit session limits much sooner than expected. Some reported a single prompt jumping their usage from 21% to 100%. Max 20x subscribers paying $200 a month watched their quota hit zero after 90 minutes of normal work. Anthropic stayed quiet for three days. Then on March 26, they posted a statement on Reddit explaining that session limits had been tightened during peak hours to manage demand. Weekly totals unchanged. Seven percent of customers affected. Sorry for the inconvenience. That explanation is incomplete. And if you pay attention to the timeline, the real story becomes obvious. ## This Is Not the First Time Less than a month before the March 23 incident, Anthropic quietly reset Claude Code rate limits after a prompt caching bug caused usage to drain far faster than expected. That bug inflated token counts by incorrectly accounting for cached context, making the system believe it was doing far more work per request than it actually was. The type of task being performed, along with other factors such as conversation complexity, features used, and the specific Claude model, can all influence token consumption and how quickly session limits are reached. That bug was fixed. Or so users were told. Then March 23 happened again. Same symptoms. Same pattern. Usage jumping in sudden lurches rather than gradual consumption. Max 5x users spent in 90 minutes. Max 20x users going from 21% to 100% on a single prompt. These are not the numbers of a system managing load. These are the numbers of a system miscounting. A load management decision produces slower, more gradual depletion. A caching bug produces sudden, inexplicable jumps. ## What Anthropic Admitted vs What Actually Happened Anthropic's official post, issued by Thariq Shihipar on March 26, confirms that peak-hour session limits were deliberately tightened. Weekdays, 5am to 11am PT, your session burns faster. That part is intentional policy. But that post does not explain a usage meter jumping from 21% to 100% on a single prompt. That is not load balancing. That is a token accounting error, specifically in how many tokens were being miscounted due to the bug. The two things are happening simultaneously, and Anthropic's post addressed the policy change while leaving the bug conspicuously unacknowledged. GitHub issue #38335 documents this precisely: users with the same workloads, same prompts, same Claude Code version, seeing tasks that previously consumed 20-30% of quota suddenly consuming 80-100% in one shot. That behavior started on March 23. It did not start the day Anthropic decided to tighten peak-hour limits. ## The Business Context Makes This Worse The Max plan tiers—including pro max and team—exist for one reason: to give power users, businesses, and enterprise customers enough headroom that they never have to think about usage caps or usage limits, even during weekday peak hours. These pro tiers (free pro max, pro max and team) offer higher allowances compared to free users or the basic free plan, supporting collaborative workflows for teams and organizations. Max 5x is $100 a month, while Max 20x is $200 a month. These are not casual subscriptions; they are professional tools and services for people running agentic workloads, coding agents, token intensive background jobs, or long-form research sessions on desktop, Excel, PowerPoint, and other integrations. The value proposition is simple: you pay a premium, you get reliable access to the Claude model and related services, with extra usage and higher weekly limits, especially outside weekday peak hours when limits during off peak are increased to accommodate growing demand. A caching bug that silently drains a $200 subscription in 90 minutes breaks that contract completely. And doing it twice in a week, with no explicit acknowledgment of the bug the second time around, compounds the damage and undermines long term trust. To be fair, the broader context is real. Anthropic's user base surged massively in March 2026 after OpenAI signed a Pentagon contract that triggered a wave of defections. ChatGPT uninstalls spiked 295% in a single day. Claude hit number one on the US App Store. That kind of growth puts genuine strain on GPU infrastructure, and tightening usage limits and weekly limits during weekday peak hours is a defensible response to a real supply problem, much like a utility company or other AI companies managing demand. Shifting token intensive background jobs and encouraging users to build habits around off-peak usage helps balance capacity across the week. But acknowledging the demand issue does not make the caching bug go away. Both things are true at the same time, and conflating them into a single "we're managing demand" marketing narrative leaves paying customers, whether accessing via API, desktop, or web, without honest information about what actually happened to their quota. Claude was created as a tool to empower businesses, teams, and individuals, and transparency is essential for long term adoption and trust in the service, whether users are leveraging Claude Cowork, Claude Opus, or other models in the suite. ## What Should Have Happened A clear incident report. Something like: "On March 23, we identified a regression in prompt cache accounting that caused token usage to be overcounted for certain workloads. This compounded the peak-hour limit adjustments we were simultaneously rolling out. We've corrected the accounting error. Here's what was affected." Instead, customers got three days of silence followed by a statement that only addressed the intentional policy change, leaving the bug quietly buried in the GitHub issue tracker where most subscribers will never see it. The good news: Anthropic does fix these bugs when they're surfaced loudly enough. They did it the first time. They'll do it again. But the pattern of silent rollouts, slow acknowledgment, and incomplete explanations is becoming a recurring issue for a company charging professional rates for developer tooling. If you're a Max subscriber and your usage is still evaporating faster than it should, you are not imagining it. Customers should shift heavy workloads to off-peak hours (evenings, weekends) as a workaround while the fix propagates. Also, monitor your Claude usage and keep an eye on GitHub issue #38335 for updates from people actually tracking the underlying bug. For a $200/month product, "it's probably a caching bug, check back later" is not an acceptable answer. Anthropic knows this. Now they need to act like it.

Related Articles