Government Data Stolen After Hacker Jailbreaks Claude AI to Write Malicious Exploit Code

February 26, 2026

A hacker successfully manipulated Anthropic’s Claude AI to launch a sophisticated month-long cyberattack against Mexican government agencies.

Between December 2025 and January 2026, the attacker utilized “jailbreaking” techniques to bypass safety guardrails, forcing the AI to identify vulnerabilities, generate functional exploit code, and exfiltrate sensitive data.

The Jailbreak Method

Cybersecurity firm Gambit Security revealed that the attacker used persistent, Spanish-language prompts to trick the AI.

By framing requests as a simulated “bug bounty program” and asking the AI to role-play as an “elite hacker,” the attacker overcame Claude’s initial refusals.

Once the safety protocols were bypassed, Claude produced thousands of detailed reports containing executable scripts for network scanning, SQL injection, and credential stuffing.

When the attacker reached Claude’s operational limits, they pivoted to ChatGPT to generate strategies for lateral movement and evasion.

The operation targeted legacy infrastructure and unpatched web applications, as reported by Cybersecuritynews.

The attacker exploited at least 20 vulnerabilities across federal and state systems, resulting in the exfiltration of approximately 150GB of sensitive data.

Target Entity	Data Stolen	Volume/Details
Federal Tax Authority (SAT)	Taxpayer records	195 million records
National Electoral Institute (INE)	Voter records	Sensitive voter data
State Governments	Employee credentials	Jalisco, Michoacán, Tamaulipas
Monterrey Water Utility	Civil files, operational data	Part of 150GB total

This incident underscores the rise of “agentic” AI threats, where advanced hacking capabilities are democratized for solo operators without complex infrastructure.

Gambit Security noted that the AI provided step-by-step attack plans, lowering the barrier to entry for cybercrime.

Anthropic investigated the breach, banned the associated accounts, and enhanced Claude Opus 4.6 with real-time misuse probes.

While federal agencies are assessing the damage, some entities, including the state of Jalisco, have denied the breach.

The event drew attention from tech leaders, with Elon Musk highlighting the risks of AI-orchestrated crime on X.

Experts are now urging governments to prioritize patching legacy systems and implementing behavioral monitoring for AI interactions.

