Artificial intelligence systems can automatically generate functional exploits for newly published Common Vulnerabilities and Exposures (CVEs) in just 10-15 minutes at approximately $1 per exploit.
This breakthrough significantly compresses the traditional “grace period” that defenders typically rely on to patch vulnerabilities before working exploits become available.
The research, conducted by security experts Efi Weiss and Nahman Khayet, reveals that their AI system can process the daily stream of 130+ newly published CVEs far faster than human researchers.
Key Takeaways
1. AI generates working CVE exploits in 10-15 minutes for $1 each.
2. Automated three-stage system analyzes CVEs, creates exploits, and validates results.
3. Defenders must now respond in minutes instead of weeks.
The implications are profound for cybersecurity defenders who historically enjoyed hours, days, or even weeks before public exploits emerged for known vulnerabilities.
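The quoted figures can be turned into a rough daily workload estimate. The sketch below uses only the numbers reported above (130 CVEs per day, 10-15 minutes and about $1 per exploit). It assumes, hypothetically, that every new CVE is attempted and that each worker handles one exploit at a time; the research does not state either.

```python
# Figures quoted in the article; everything else is derived arithmetic.
CVES_PER_DAY = 130               # "130+ newly published CVEs" per day
MINUTES_PER_EXPLOIT = (10, 15)   # quoted range per exploit
COST_PER_EXPLOIT_USD = 1.0       # "approximately $1 per exploit"
MINUTES_PER_DAY = 24 * 60


def daily_load(cves: int = CVES_PER_DAY) -> dict:
    """Estimate daily compute time and cost (assumes every CVE is attempted)."""
    low, high = (cves * minutes for minutes in MINUTES_PER_EXPLOIT)
    # Ceiling division: workers needed so the slowest case still fits in one day.
    workers = -(-high // MINUTES_PER_DAY)
    return {
        "machine_minutes": (low, high),
        "machine_hours": (round(low / 60, 1), round(high / 60, 1)),
        "parallel_workers_needed": workers,
        "daily_cost_usd": cves * COST_PER_EXPLOIT_USD,
    }


if __name__ == "__main__":
    for name, value in daily_load().items():
        print(f"{name}: {value}")
```

Under those assumptions the daily feed amounts to roughly 22 to 33 machine-hours and about $130, so two parallel workers would be enough to keep pace with publication.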
AI-Powered Exploit Generation
The researchers developed a sophisticated three-stage pipeline that combines Large Language Models (LLMs) with automated testing environments.
The system begins by analyzing CVE advisories and GitHub Security Advisory (GHSA) data, extracting crucial information including affected repositories, vulnerable versions, and patch details.
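The extraction step can be illustrated with a minimal sketch, assuming the advisory is available as an OSV-format JSON record (the format used for published GitHub advisories). This is not the researchers' code: the `AdvisorySummary` class, the `summarize` function, and the sample record with its placeholder ID and package name are hypothetical.

```python
import json
from dataclasses import dataclass, field


@dataclass
class AdvisorySummary:
    advisory_id: str
    aliases: list[str]
    ecosystem: str
    package: str
    introduced: list[str] = field(default_factory=list)
    fixed: list[str] = field(default_factory=list)
    patch_urls: list[str] = field(default_factory=list)


def summarize(record: dict) -> list[AdvisorySummary]:
    """Return one summary per affected package in an OSV-style record."""
    # OSV reference type "FIX" marks a source change that fixes the issue.
    # Real advisories may label patch links differently (for example "WEB").
    patch_urls = [
        ref["url"] for ref in record.get("references", []) if ref.get("type") == "FIX"
    ]
    summaries = []
    for affected in record.get("affected", []):
        pkg = affected.get("package", {})
        summary = AdvisorySummary(
            advisory_id=record.get("id", ""),
            aliases=list(record.get("aliases", [])),
            ecosystem=pkg.get("ecosystem", ""),
            package=pkg.get("name", ""),
            patch_urls=list(patch_urls),
        )
        for version_range in affected.get("ranges", []):
            for event in version_range.get("events", []):
                if "introduced" in event:
                    summary.introduced.append(event["introduced"])
                if "fixed" in event:
                    summary.fixed.append(event["fixed"])
        summaries.append(summary)
    return summaries


# Hypothetical record for illustration only; not a real advisory.
SAMPLE = json.loads("""
{
  "id": "GHSA-0000-0000-0000",
  "aliases": ["CVE-0000-00000"],
  "affected": [{
    "package": {"ecosystem": "npm", "name": "example-lib"},
    "ranges": [{"type": "SEMVER",
                "events": [{"introduced": "0"}, {"fixed": "1.2.3"}]}]
  }],
  "references": [
    {"type": "FIX", "url": "https://example.com/commit/abc"},
    {"type": "WEB", "url": "https://example.com/notes"}
  ]
}
""")

if __name__ == "__main__":
    for item in summarize(SAMPLE):
        print(item)
```

The resulting summary carries the fields the article lists (affected package, vulnerable range, and patch references) in a form a later stage could consume.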
The first stage involves technical analysis where the AI examines the vulnerability advisory and corresponding code patches.
For example, when processing CVE-2025-54887, a cryptographic bypass affecting JWT encryption, the system identified the specific attack vector and created a comprehensive exploitation plan.
[Figure: Iterative vulnerability exploitation cycle]
The second stage implements a test-driven approach using separate AI agents for creating vulnerable applications and exploit code.
The researchers discovered that using specialized agents prevented confusion between different tasks.
They employed Dagger containers to create secure sandboxes for testing, enabling the system to validate exploits against both vulnerable and patched versions to eliminate false positives.
The validation loop proved critical, as initial attempts often produced “false positive” exploits that worked against both vulnerable and secure implementations.
The system iteratively refines both the vulnerable test application and exploit code until achieving genuine exploitation.
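The false-positive check described above reduces to a differential test: a candidate counts only if it succeeds against the vulnerable build and fails against the patched one. The sketch below shows that logic on a deliberately harmless toy (a token checker that wrongly accepts an empty string). All names are hypothetical, the containers are replaced by plain functions, and the list of candidates stands in for successive agent attempts. Unlike the described system, it does not refine the test application itself.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

Target = Callable[[str], bool]        # toy stand-in for a sandboxed application
Candidate = Callable[[Target], bool]  # returns True if the attempt "worked"


@dataclass
class Verdict:
    hits_vulnerable: bool
    hits_patched: bool

    @property
    def genuine(self) -> bool:
        # Must work on the vulnerable build AND fail on the patched build.
        return self.hits_vulnerable and not self.hits_patched


def validate(candidate: Candidate, vulnerable: Target, patched: Target) -> Verdict:
    return Verdict(candidate(vulnerable), candidate(patched))


def first_genuine(candidates: Iterable[Candidate], vulnerable: Target,
                  patched: Target, max_attempts: int = 5) -> Optional[int]:
    """Return the 1-based attempt number of the first genuine candidate."""
    for attempt, candidate in enumerate(candidates, start=1):
        if attempt > max_attempts:
            break
        if validate(candidate, vulnerable, patched).genuine:
            return attempt
    return None


# Toy targets: the "vulnerable" checker wrongly accepts an empty token.
def vulnerable_check(token: str) -> bool:
    return token == "" or token == "valid-token"


def patched_check(token: str) -> bool:
    return token == "valid-token"


# Attempt 1 passes on both builds, so it is a false positive.
def uses_valid_token(target: Target) -> bool:
    return target("valid-token")


# Attempt 2 passes only on the vulnerable build.
def uses_empty_token(target: Target) -> bool:
    return target("")


if __name__ == "__main__":
    attempts = [uses_valid_token, uses_empty_token]
    print(first_genuine(attempts, vulnerable_check, patched_check))
```

The first attempt is rejected because it also "works" against the patched checker, which is the false-positive pattern the article describes.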
[Figure: Exploit]
The research produced working exploits for various vulnerability types across different programming languages.
Notable examples include GHSA-w2cq-g8g3-gm83, a JavaScript prototype pollution vulnerability, and GHSA-9gvj-pp9x-gcfr, a Python pickle sanitization bypass.
The team utilized Claude Sonnet 4.0 as their primary model after finding that Software-as-a-Service (SaaS) models’ initial guardrails could be bypassed through carefully structured prompt chains.
They implemented caching mechanisms and type-safe interfaces using pydantic-ai to optimize performance and reliability.
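How caching and typed interfaces fit together can be sketched with the standard library alone. This is not pydantic-ai's API and not the researchers' implementation: plain dataclasses stand in for the typed models, and `CachedAnalyzer`, `AnalysisRequest`, `AnalysisResult`, and `fake_backend` are hypothetical names.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Callable


@dataclass(frozen=True)
class AnalysisRequest:
    advisory_id: str
    advisory_text: str


@dataclass(frozen=True)
class AnalysisResult:
    advisory_id: str
    summary: str


class CachedAnalyzer:
    """Wrap a (possibly slow or costly) backend with a content-addressed cache."""

    def __init__(self, backend: Callable[[AnalysisRequest], AnalysisResult]):
        self._backend = backend
        self._cache: dict[str, AnalysisResult] = {}
        self.backend_calls = 0

    @staticmethod
    def _key(request: AnalysisRequest) -> str:
        # Identical requests hash to the same key, so repeats skip the backend.
        payload = json.dumps(asdict(request), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def analyze(self, request: AnalysisRequest) -> AnalysisResult:
        key = self._key(request)
        if key not in self._cache:
            self.backend_calls += 1
            result = self._backend(request)
            # Reject anything that is not the declared result type.
            if not isinstance(result, AnalysisResult):
                raise TypeError("backend returned an unexpected type")
            self._cache[key] = result
        return self._cache[key]


def fake_backend(request: AnalysisRequest) -> AnalysisResult:
    # Stand-in for a model call; returns a trivial summary.
    return AnalysisResult(request.advisory_id,
                          f"{len(request.advisory_text)} characters analyzed")


if __name__ == "__main__":
    analyzer = CachedAnalyzer(fake_backend)
    request = AnalysisRequest("GHSA-0000-0000-0000", "placeholder advisory text")
    analyzer.analyze(request)
    analyzer.analyze(request)
    print(analyzer.backend_calls)
```

Keying the cache on a hash of the full request means a repeated analysis of the same advisory costs nothing, while any change to the input triggers a fresh call.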
All generated exploits are timestamped using OpenTimestamps blockchain verification and made publicly available.
The researchers emphasize that traditional “7-day critical vulnerability fix” policies may become obsolete as AI capabilities advance, forcing defenders to dramatically accelerate their response times from weeks to minutes.
This development marks a significant shift in the cybersecurity landscape: automated exploit development could fundamentally alter the balance between attackers and defenders.