Artificial intelligence has officially entered the realm of advanced vulnerability research, moving beyond simple code assistance to autonomous threat hunting.
In a groundbreaking collaboration between Anthropic and Mozilla, the Claude Opus 4.6 model independently discovered 22 security flaws in Firefox over a two-week period in February 2026.
This accelerated discovery rate outpaces traditional manual research: the AI uncovered more vulnerabilities in one month than human researchers reported in any single month of 2025.
Fourteen of these discoveries were classified as high-severity, representing nearly twenty percent of all critical Firefox flaws patched the previous year.
Anthropic initially tested Claude on the CyberGym benchmark, reproducing historical Common Vulnerabilities and Exposures, before unleashing it on the live Firefox codebase.
The research team strategically targeted the browser’s JavaScript engine because its wide attack surface routinely processes untrusted external code.
Within just twenty minutes of autonomous exploration, Claude identified a critical Use After Free memory vulnerability. In this class of memory corruption, a program keeps using memory after it has been freed, letting attackers reclaim that memory and fill it with arbitrary malicious payloads.
By the end of the project, the AI had scanned nearly 6,000 C++ files and submitted 112 unique reports to Bugzilla.
AI Exploit Capabilities and Limits
While Claude excels at uncovering zero-day flaws, weaponizing them into functional exploits remains highly inefficient.
Anthropic researchers tasked the model with developing primitive exploits to read and write local files on a target machine.
Out of hundreds of automated attempts costing $4,000 in API credits, Claude successfully breached the system only twice.
This demonstrates that identifying vulnerabilities is currently an order of magnitude cheaper than exploiting them.
Furthermore, these attacks only succeeded in a restricted testing environment that intentionally bypassed Firefox’s sandbox protections.
The browser’s standard defense-in-depth architecture would have successfully mitigated these specific AI-generated exploits in real-world scenarios.
| Metric | Detail |
|---|---|
| Target Component | Firefox JavaScript Engine (C++ codebase) |
| Initial Discovery | Use After Free memory corruption |
| Scan Volume | 6,000 C++ files analyzed |
| High-Severity Flaws | 14 vulnerabilities |
| Exploit Success Rate | 2 successful local file read/write attacks |
Mozilla efficiently triaged these bulk submissions and deployed the necessary security patches to hundreds of millions of users in the Firefox 148.0 release.
As frontier language models evolve into world-class vulnerability researchers, defenders must modernize their patching infrastructure to keep pace.
Security teams deploying AI for bug hunting should utilize task verifiers to validate their findings.
These trusted verification mechanisms allow an AI agent to continuously test its own work, confirming that a proposed patch permanently removes the vulnerability while preserving core program functionality.
To streamline the coordinated vulnerability disclosure process, researchers must provide maintainers with actionable data.
The Firefox security team highlighted several required elements for trusting AI-generated bug reports:
- Submit accompanying minimal test cases to quickly isolate the specific crashing input.
- Provide detailed proofs-of-concept that clearly demonstrate the exploit execution path.
- Include verified candidate patches to accelerate the final remediation timeline.