
Anthropic’s Project Glasswing uses its most powerful AI model to find thousands of zero-day vulnerabilities in critical software. Here’s everything developers and security professionals need to know.
Anthropic just launched Project Glasswing — a cybersecurity initiative that pairs the world’s biggest tech companies with an AI model powerful enough to find vulnerabilities that have been hiding in critical software for decades. The model has already discovered thousands of zero-day vulnerabilities, including bugs in every major operating system and web browser.
This isn’t a product announcement or a new API feature. It’s a coordinated defensive effort built on the premise that AI-powered cyberattacks are coming, and the best defense is finding the vulnerabilities first.
Project Glasswing is a cross-industry cybersecurity initiative launched by Anthropic on April 7, 2026. Its core mission: use AI to find and patch vulnerabilities in critical software infrastructure before attackers can exploit them.
The initiative is powered by Claude Mythos Preview, Anthropic’s most advanced unreleased frontier model. Unlike previous Claude models, Mythos has emergent capabilities in vulnerability discovery and exploit development that represent a qualitative leap — not from explicit security training, but from general improvements in code reasoning.
Anthropic’s argument is straightforward: AI models have reached a capability level where they surpass most humans at finding and exploiting software vulnerabilities. As these capabilities proliferate, malicious actors will inevitably gain access. The fallout — for economies, public safety, and national security — could be severe. Project Glasswing is the preemptive response: use that same power defensively.
The results are striking. Claude Mythos Preview has already discovered thousands of zero-day vulnerabilities — bugs that have gone undetected for years, sometimes decades:
| Vulnerability | Software | Age | Details |
|---|---|---|---|
| Signed integer overflow in SACK implementation | OpenBSD | 27 years | Network stack vulnerability |
| H.264 codec exploit via slice sentinel collision | FFmpeg | 16 years | Media processing vulnerability |
| Guest-to-host memory corruption | Production memory-safe VMM | — | Hypervisor escape |
| Multiple vulnerabilities | Every major OS and web browser | Various | Across the full stack |
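To make the "signed integer overflow" class concrete, here is a minimal, hypothetical sketch (not the actual OpenBSD SACK code) of how C-style 32-bit signed arithmetic can silently wrap to a negative value, and how a checked version rejects it:

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def add_i32(a: int, b: int) -> int:
    """Emulate C signed 32-bit addition with two's-complement wraparound."""
    result = (a + b) & 0xFFFFFFFF
    return result - 2**32 if result > INT32_MAX else result

def add_i32_checked(a: int, b: int) -> int:
    """Same addition, but raise instead of wrapping on overflow."""
    result = a + b
    if not (INT32_MIN <= result <= INT32_MAX):
        raise OverflowError(f"i32 overflow: {a} + {b}")
    return result

# A large sequence-number delta wraps negative under 32-bit arithmetic:
print(add_i32(2_000_000_000, 200_000_000))   # -> -2094967296
```

In network-stack code, a wrapped value like this can pass a signed comparison it should have failed, which is exactly the kind of decades-old logic flaw the table above describes.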
And it doesn’t just find bugs; it develops working exploits.
Less than 1% of discovered vulnerabilities have been patched so far. Anthropic uses a 90+45 day responsible disclosure timeline and SHA-3 commitment hashes to prove possession of vulnerability details without revealing them.
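Anthropic's exact commitment scheme isn't published, but the general pattern is simple: publish a hash of the vulnerability report now, reveal the report (plus a blinding nonce) after the disclosure window. A minimal sketch using Python's standard library:

```python
import hashlib
import secrets

def commit(report: bytes) -> tuple[str, bytes]:
    """Publish the digest immediately; keep (report, nonce) secret until disclosure."""
    nonce = secrets.token_bytes(32)  # random blinding factor, defeats brute-force guessing
    digest = hashlib.sha3_256(nonce + report).hexdigest()
    return digest, nonce

def verify(digest: str, report: bytes, nonce: bytes) -> bool:
    """Anyone can later check the revealed report against the published digest."""
    return hashlib.sha3_256(nonce + report).hexdigest() == digest

published, nonce = commit(b"heap overflow in example_parse(), reachable via ...")
assert verify(published, b"heap overflow in example_parse(), reachable via ...", nonce)
```

The published digest proves Anthropic possessed the details on a given date without revealing anything exploitable before the patch ships.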
Claude Mythos Preview is not just incrementally better — it represents a capability jump in code security analysis.
| Benchmark | Mythos Preview | Opus 4.6 | Delta |
|---|---|---|---|
| CyberGym (vulnerability analysis) | 83.1% | 66.6% | +16.5 |
| SWE-bench Pro | 77.8% | 53.4% | +24.4 |
| SWE-bench Verified | 93.9% | 80.8% | +13.1 |
| BrowseComp | 86.9% | 83.7% | +3.2 |
| GPQA Diamond (scientific reasoning) | 94.6% | 91.3% | +3.3 |
| Humanity’s Last Exam (no tools) | 56.8% | 40.0% | +16.8 |
| Humanity’s Last Exam (with tools) | 64.7% | 53.1% | +11.6 |
The security gap is dramatic. In an OSS-Fuzz corpus test with 7,000 entry points, Mythos achieved 595 crashes at tiers 1-2, with 10 full control flow hijacks. Against Firefox 147’s JavaScript engine, it developed 181 working exploits — compared to just 2 from Opus 4.6.
Anthropic’s red team notes that “Opus 4.6 had a near-0% success rate at autonomous exploit development.” Mythos didn’t get these capabilities from specialized security training — they emerged from general improvements in code reasoning. That’s what makes this both powerful and concerning.
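For readers unfamiliar with OSS-Fuzz, an "entry point" is a small harness that feeds attacker-controlled bytes into a library; any uncaught fault counts as a crash. Here is a toy illustration (the parser and its bug are hypothetical, and real fuzzers are coverage-guided rather than purely random):

```python
import random

def parse_header(data: bytes) -> int:
    """Hypothetical buggy parser: trusts a length field taken from the input."""
    if len(data) < 2:
        return 0
    declared_len = data[0]
    return data[1:1 + declared_len][-1]  # IndexError when declared_len is 0

def fuzz_one_input(data: bytes) -> None:
    """Harness in the style of an OSS-Fuzz entry point."""
    parse_header(data)

crashes = 0
rng = random.Random(0)  # fixed seed for reproducibility
for _ in range(10_000):
    data = bytes(rng.randrange(256) for _ in range(rng.randrange(1, 8)))
    try:
        fuzz_one_input(data)
    except IndexError:
        crashes += 1
print(f"crashes: {crashes}")
```

A crash-triaging agent then works backwards from each crashing input to decide whether the fault is merely a denial of service or, as in the "control flow hijack" cases above, fully exploitable.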
The model operates within an agentic scaffold, not a static scanner. It acts as an autonomous agent that reasons about code behavior, distinguishes intended from actual functionality, and identifies logic vulnerabilities such as authentication bypasses, not just memory corruption patterns.
Project Glasswing is not a general-purpose developer tool. Access is deliberately restricted:
The launch partners are Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks.
Approximately 40 additional organizations responsible for critical software infrastructure also have access.
If you maintain a public repository with 5,000+ GitHub stars or 1M+ monthly NPM downloads, you can apply through the Claude for Open Source program.
This is the most accessible path for individual developers. The program provides Claude access specifically for security analysis of open-source projects.
An upcoming Cyber Verification Program will allow legitimate security professionals to apply for access. Details haven’t been announced yet, but this will likely require professional credentials or organizational affiliation.
Claude Mythos Preview is available in gated research preview through Amazon Bedrock with enterprise-grade security controls — customer-managed encryption, VPC isolation, and detailed logging.
After the research preview, API pricing will be $25 / $125 per million input/output tokens through the Claude API, Amazon Bedrock, Google Vertex AI, and Microsoft Foundry.
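At those rates, estimating the cost of a run is straightforward; the workload sizes below are illustrative, not from Anthropic:

```python
PRICE_IN, PRICE_OUT = 25.0, 125.0  # USD per million input / output tokens

def cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one API run at the announced post-preview pricing."""
    return input_tokens / 1e6 * PRICE_IN + output_tokens / 1e6 * PRICE_OUT

# e.g. a hypothetical code-audit run: 400k tokens of source in, 60k tokens of findings out
print(round(cost_usd(400_000, 60_000), 2))  # -> 17.5
```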
Even if you don’t have direct access to Project Glasswing, its implications are significant:
Your dependencies will get more secure. Project Glasswing is scanning the software that everything else is built on — operating systems, browsers, media codecs, network stacks, hypervisors. Patches flowing from this initiative will improve the security of the entire ecosystem.
The vulnerability landscape is changing. AI can now find bugs that decades of human review missed. This raises the bar for what “secure code” means and accelerates the timeline on which known vulnerability classes get discovered and patched.
AI-powered security tools are coming. What Mythos can do today in a restricted setting, other models will approach in the coming years. Security-aware development practices and tooling will become table stakes.
Open-source gets disproportionate benefit. Anthropic has committed $2.5 million to Alpha-Omega and OpenSSF via the Linux Foundation, plus $1.5 million to the Apache Software Foundation. Combined with $100 million in model usage credits for participants, this is a substantial investment in open-source security.
Not everyone is enthusiastic. Community reactions have been mixed:
Selective access concerns. Critics argue that restricting access to big tech companies creates an asymmetry — large organizations get better security while smaller projects and companies are left out. Some see this as contradicting Anthropic’s public benefit corporation status.
Safety questions. Was 24 hours of internal review sufficient before announcing a model this capable? Anthropic argues they’ve been preparing for months, but the compressed public timeline has drawn scrutiny.
Marketing skepticism. Some observers question whether this is partly a marketing exercise ahead of Anthropic’s potential IPO, positioning the company as a responsible steward of powerful AI.
The “damned if you do” dynamic. Both releasing the model widely and restricting it have downsides. Wide release risks empowering attackers. Restricted release risks creating a permanent security divide. There’s no clean answer.
Anthropic plans to eventually transition governance of Project Glasswing to “an independent, third-party body” coordinating cybersecurity projects across private and public sectors.
Here are the concrete paths available today:
| Path | Requirements | How to Apply |
|---|---|---|
| Claude for Open Source | 5,000+ GitHub stars or 1M+ NPM downloads | Via the Claude for Open Source program |
| Cyber Verification Program | Security professional credentials | Coming soon |
| Enterprise (Amazon Bedrock) | Enterprise agreement | Through AWS |
| Launch Partner | Critical infrastructure org | By invitation |
For most developers, the Claude for Open Source program is the realistic entry point. If you maintain a qualifying project, apply now — the program provides Claude access for security analysis of your codebase.
Project Glasswing is the most ambitious AI-powered cybersecurity initiative to date. It pairs an AI model that can find decades-old zero-days autonomously with the organizations responsible for the world’s most critical software.
The restricted access model is controversial but arguably necessary — the same capabilities that make Mythos an exceptional defender would make it an exceptional attacker in the wrong hands. For now, the benefits flow through coordinated disclosure and patching to the entire ecosystem.
For developers, the takeaway is practical: your software’s dependencies are about to get more security scrutiny than they’ve ever had. The vulnerabilities that Mythos is finding today will become patches in the coming months. Keep your dependencies updated, watch for security advisories, and if you maintain a qualifying open-source project, apply for the Claude for Open Source program.
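Watching for advisories can be partially automated. As one illustrative sketch (the advisory list here is hypothetical; in practice you would pull it from a vulnerability database or use a dedicated audit tool), compare installed package versions against known-safe minimums:

```python
from importlib.metadata import PackageNotFoundError, version

# Hypothetical advisory data: package -> minimum safe version
ADVISORIES = {"requests": "2.31.0", "urllib3": "2.0.7"}

def parse(v: str) -> tuple[int, ...]:
    """Turn '2.31.0' into (2, 31, 0) so versions compare numerically."""
    return tuple(int(p) for p in v.split(".") if p.isdigit())

for pkg, min_safe in ADVISORIES.items():
    try:
        installed = version(pkg)
    except PackageNotFoundError:
        continue  # package not in this environment
    status = "OK" if parse(installed) >= parse(min_safe) else "UPDATE NEEDED"
    print(f"{pkg} {installed}: {status}")
```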
The age of AI-powered vulnerability discovery is here. Project Glasswing is the first coordinated attempt to make sure the defenders move first.
Viktor Zeman is a co-owner of QualityUnit. Even after 20 years of leading the company, he remains primarily a software engineer, specializing in AI, programmatic SEO, and backend development. He has contributed to numerous projects, including LiveAgent, PostAffiliatePro, FlowHunt, UrlsLab, and many others.
