On April 7, 2026, Anthropic published a research preview that quietly redrew the boundaries of what a single AI model is capable of. The model is called Claude Mythos. Its specialty is finding software security vulnerabilities. And in pre-release testing, it found them by the thousand, across every major operating system, every major web browser, and a long tail of open-source projects that underpin much of the modern internet. The bugs it surfaced ranged from logic flaws written last month to a remote denial-of-service vulnerability in OpenBSD's TCP stack that had survived since 1998.

Then Anthropic did something unusual for the AI industry. The company did not ship Mythos Preview to the public. It did not put it behind an API gate, did not list it in the model picker, did not announce an opt-in waitlist. Instead, it launched Project Glasswing, a closed coalition of operating-system vendors, browser makers, cloud providers, and a small group of security firms, and granted them access to Mythos so that the most critical infrastructure on the internet could be patched before the model's capabilities became widely available.

It is the most consequential decision Anthropic has made since the founding of the company, and it forces a question the AI industry has been circling for years: what do you do when a model you have built can do something genuinely dangerous, but also genuinely valuable?

Claude Mythos Preview by the Numbers

  • Critical-severity zero-days found: 1,000+
  • Working exploits on first attempt: 83%
  • Firefox 147 JS exploits (vs. Opus 4.6's 2): 181
  • OS coverage: Every major OS & browser
  • Oldest bug surfaced: 27 years (OpenBSD, 1998)

From Near-Zero to Eighty-Three Percent

To appreciate how large a step Claude Mythos represents, it helps to remember what was true a year ago. Autonomous exploit development, the task of taking a vulnerable program, finding the bug, writing a working exploit, and chaining it into a useful attack, has historically been one of the few capability ceilings AI models could not break. The work demands a combination of deep code understanding, creative search over enormous state spaces, and the kind of low-level intuition that takes human researchers years to develop. As recently as Claude Opus 4.6, Anthropic's previous frontier model, this capability sat at what Anthropic describes as a "near-0%" success rate on the internal exploit-development benchmark.

Mythos Preview did not improve on that number. It blew through it. On a battery of tests built around the OSS-Fuzz corpus of around 1,000 real open-source projects, the model produced 595 crashing inputs at the lower difficulty tiers and a small but meaningful number at the highest tier, full control-flow hijack: the technical bar at which an attacker can actually take over a process. On a hardened build of Firefox 147 with all modern mitigations enabled, Mythos succeeded at producing working JavaScript exploits 181 times, against the prior model's two. On Linux kernel privilege escalation, it autonomously chained two to four vulnerabilities into complete exploit paths.

The specific findings are even more striking than the totals. Mythos rediscovered a 27-year-old vulnerability in OpenBSD's TCP SACK implementation, sitting in production code that had been reviewed by some of the most rigorous open-source maintainers in the world. It found a 16-year-old flaw in FFmpeg's H.264 codec. It surfaced a remote code execution path in FreeBSD's NFS implementation that has now been assigned CVE-2026-4747. And it did all of this autonomously, with an agentic scaffolding built on Claude Code, running unattended over batches of repositories.

"Over 99% of the vulnerabilities we've found have not yet been patched. In the short term, this could matter a great deal to attackers, if frontier labs aren't careful about how they release these models."
(Anthropic, Claude Mythos Preview announcement, April 2026)

Why Anthropic Held It Back

Anthropic's reasoning for the limited release is laid out in unusually plain language. The company estimates that more than ninety-nine percent of the vulnerabilities Mythos has surfaced are still unpatched in the wild. Releasing the model publicly today, before defenders have caught up, would create an asymmetric risk: any attacker with API access could in principle replicate the same discovery process, but defenders would not have time to ship fixes. The window between disclosure and mitigation, normally measured in months, would compress to days.

Project Glasswing is Anthropic's attempt to use that compression in the defender's favor instead of the attacker's. Under the program, Mythos Preview is being applied, under strict access controls and human review, to the codebases that matter most: the operating systems most enterprises run, the browsers most users browse with, the cryptographic libraries that secure most internet traffic. Anthropic has committed to standard responsible-disclosure timelines for the findings, with a 90-day window before public disclosure and a 45-day extension available for complex fixes.

The structure is familiar to anyone who has worked in coordinated vulnerability disclosure. What is new is the scale and the source. A single research model is generating findings at a rate that no individual security researcher, and arguably no security firm, could match. That changes the operational picture for every party in the disclosure chain.

The Glasswing Coalition

Anthropic has not published a full roster of Glasswing participants, but the program is structured around the major operating-system, browser, and cloud vendors, along with a tier of dedicated security firms. The intent is that any critical infrastructure software with broad reach is represented, so that fixes can be coordinated across the entire stack rather than landing in one project at a time. Independent security evaluators have also been brought in, including the UK AI Safety Institute, which has published its own assessment of Mythos's cyber capabilities alongside Anthropic's announcement.

For organizations not in the coalition, the practical effect of Glasswing is that they will see waves of security patches over the coming months whose underlying findings will not be attributed in public until disclosure windows expire. Anthropic has explicitly asked the security community to be patient with the resulting opacity. The alternative, the company argues, is broadcasting attack vectors before defenders can fix them.

What This Means for the Rest of the Industry

Mythos is the clearest case yet of a frontier AI lab making a unilateral non-release decision on capability grounds. It will not be the last. Every lab competing at the frontier is now confronted with a question that previously felt theoretical: at what capability threshold does a model become too dangerous to ship, even to paying customers, even under a usage policy? Anthropic's answer, in the case of Mythos, is that the threshold is not abstract. It is reached the moment a model can reliably produce working exploits against software that billions of people depend on.

The decision also has direct implications for AI governance. Regulators in both the European Union and the United States have spent the past eighteen months building frameworks for evaluating frontier models against capability-based risk thresholds. The EU AI Act, in particular, is structured around the idea that some classes of AI capability warrant restrictions independent of how the model is used. Mythos is a working example of how that principle plays out: a capability so consequential that the developer voluntarily withdrew it from the open market, and offered a coordinated industry response in its place.

It is also a vindication of the position Anthropic has staked out since its founding. The company's Responsible Scaling Policy commits it to evaluate frontier models against cyber, biological, and autonomy thresholds before deployment. Until Mythos, those thresholds were largely policy documents, important but untested. With Mythos, the policy collided with reality, and Anthropic chose the more restrictive path. For an industry that has grown accustomed to capability releases racing each other to market, this is a substantive shift.

The Open Questions

Several questions remain open. The first is whether non-coalition labs will reach similar capabilities, and if so, whether they will mirror Anthropic's restraint. The second is what happens when the Glasswing program ends or when Mythos's successor is ready. Anthropic has not committed to a specific timeline for a broader release, but has indicated that public availability is a question of "when defenders catch up," not "if." The third is whether responsible disclosure at this scale is sustainable. Maintainers of open-source projects are already operating at the edge of their capacity, and an AI model that produces critical findings faster than projects can triage them is a new kind of operational stress.

For now, the lesson of Claude Mythos is simpler than any of those questions. A frontier AI model has demonstrated the ability to do work that, at its scale and accuracy, no human team could match. The developer chose to use that ability defensively first, offensively never, and to accept commercial cost in exchange for systemic safety. Whether the rest of the industry follows that example is, as much as anything else, the story of the next year of AI.
