AI · 5 min read · By Paul Lefizelier

Claude Mythos: Anthropic Accidentally Exposed Its Most Powerful Model — And It's Too Dangerous to Release

A CMS misconfiguration at Anthropic exposed 3,000 internal files revealing Claude Mythos (Capybara), a model above Opus deemed too dangerous for public release due to cybersecurity risks.


A CMS misconfiguration. 3,000 internal files publicly accessible. And among them: proof that Anthropic built a model it considers too dangerous to release. Its name: Claude Mythos. Internal codename: Capybara. For the first time, a major AI lab has officially — accidentally — documented that it has surpassed its own ability to safely distribute what it created.

3,000 Files, One Mistake, One Revelation

On March 26, 2026, a configuration error in Anthropic's CMS (Content Management System) made roughly 3,000 unpublished internal files publicly accessible. Draft announcements, images, PDFs, strategy documents — all indexable on the open web, with no authentication required.

The discovery was simultaneous. Bea Nolan from Fortune found the files at the same time as two independent researchers, Alexandre Pauwels and Roy Paz. Anthropic confirmed shortly after: it was a "human error" in the CMS configuration.

What sets this leak apart: this isn't data stolen by a hacker, and these aren't rumors from anonymous sources. These are official Anthropic documents describing their own model, in their own words. There is no plausible deniability, no way to downplay what was exposed.

Mythos: Above Opus, in an Entirely New Category

To understand what Claude Mythos represents, you need to know Claude's current lineup. Anthropic offers three model tiers: Haiku (fast and affordable), Sonnet (balanced for daily use), and Opus (the most powerful, built for complex tasks).

Mythos doesn't replace Opus. It creates a new category above it.

The quote found in Anthropic's internal drafts is unambiguous: "Larger and smarter than our Opus models, which were, until now, our most powerful." The internal codename — Capybara — confirms this isn't an Opus 5 or an incremental update. It's something structurally different.

Three areas of outperformance are documented in the leaked files. Cybersecurity first, with the widest gap compared to Opus. Software programming second. And academic reasoning third.

| Model | Tier | Use Case | Status |
|---|---|---|---|
| Claude Haiku | Light | Fast, affordable | Available |
| Claude Sonnet | Balanced | Daily use, API | Available |
| Claude Opus | Powerful | Complex tasks | Available |
| Claude Mythos | New category | Cyber, code, research | Restricted early access |

"Too Powerful to Release" — The Precedents That Scare Anthropic

This isn't marketing posturing. Anthropic has specific reasons for refusing a public release.

First precedent: a security test where Claude was turned into a malware factory in just 8 hours. The model generated functional malicious code at an industrial pace.

Second precedent: Anthropic detected and blocked a Chinese state-sponsored campaign using Claude Code to infiltrate 30 organizations. Not a theoretical attack — a real operation, stopped by Anthropic in real time.

Mythos outperforms Opus "dramatically" in cybersecurity. Anthropic describes its cyber capabilities as "unprecedented." If Opus already enabled those scenarios, what does Mythos enable? That's the question Anthropic refuses to address publicly — and precisely why the model won't be available on claude.ai anytime soon.

The Strategy: Defenders First, Everyone Else Later (Maybe)

Anthropic chose an asymmetric deployment — a first in the AI industry.

Mythos won't be available to the public in the near term. Two reasons: high operational costs and a cybersecurity risk profile deemed unacceptable for broad distribution.

Access will be limited to a restricted group of early access clients. Priority goes to cyber defenders — organizations strengthening their security systems. The logic: arm defenders before attackers figure out how to exploit the model.

The irony is striking. The model is so effective at offensive cyber that it must first be given to defenders to level the playing field. An invite-only CEO summit was even planned in the UK, where Dario Amodei was set to present Mythos to CEOs of major European companies.

The Week's Narrative Thread: From "We Achieved AGI" to "Too Dangerous"

This leak didn't happen in a vacuum. It caps a week that will go down in AI history.

| Date | Event | Signal |
|---|---|---|
| Mon Mar 23 | Jensen Huang: "we achieved AGI" | Abstract declaration |
| Wed Mar 25 | Linear: "issue tracking is dead" + Figma Canvas Agents | Agents in everyday tools |
| Thu Mar 26 | Meta HyperAgents | AI improves its own learning |
| Fri Mar 27 | Claude Mythos leak | Model too powerful to distribute |

Monday, Jensen Huang declared AGI (Artificial General Intelligence) achieved. Abstract, declarative, no technical proof. Thursday, Meta unveiled HyperAgents — an AI that improves its own learning mechanism. Technical, but still at the research stage. Friday, Anthropic exposed a model it refuses to release because it's too capable. Concrete, documented, confirmed.

These are three levels of the same signal. In one week, the AI industry stopped talking about what models will do in the future. It's talking about what they already do — and how to contain them. That's a fundamental shift in register.


Key Takeaways

  • On March 26, 2026, a CMS error at Anthropic exposed ~3,000 internal files revealing Claude Mythos (codename Capybara), an entirely new model tier above the Opus lineup.
  • Anthropic describes Mythos as "by far the most powerful ever built" — outperforming Opus in cybersecurity, coding, and academic reasoning.
  • Anthropic refuses a near-term public release: cyber capabilities deemed "unprecedented" and risk too high for broad distribution.
  • Deployment strategy: access reserved for cyber defenders first, through a restricted early access program.
  • The leak comes 4 days after Jensen Huang declared "we achieved AGI" — giving that claim a technical, documented cybersecurity dimension it previously lacked.

The irony is complete. Anthropic is one of the few companies that built its entire identity on AI safety. "Safety first" isn't a slogan; it's why the company exists. And it's precisely Anthropic that just revealed, accidentally, that it built a model so advanced it doesn't know how to distribute it without causing harm. This isn't speculative dystopia. It's an internal document, confirmed, dated March 26, 2026. The question is no longer "when will AI be dangerous?" The question is "how do you distribute what already is?"

#anthropic #claude #claude-mythos #leak #cybersecurity #dario-amodei #capybara #ai-model #safety