Cybersecurity researchers from Pillar Security have detailed a new threat named Hackerbot-Claw, aka Chaos Agent. This marks the first time an AI agent has been caught carrying out a full-scale attack on software infrastructure using simple human language. Over a frantic 37-hour period in late February 2026, the automated attacker targeted major projects on GitHub, including those run by Microsoft, DataDog, and Aqua Security.
A Rapid Escalation
The campaign moved with machine speed, scanning for gaps and hijacking developer tools within minutes. It focused on CI/CD pipelines, the automated assembly lines that developers use to test and publish their code. By finding mistakes in how these pipelines were set up, the AI agent was able to sneak in malicious commands.
The operation began on 27 February with a series of lightning-fast strikes. The attacker first hit Microsoft and DataDog, using tricks like branch name and filename injections to bypass security filters. DataDog was forced to deploy an emergency patch in under 13 hours to stop the breach.
These initial phases were just the beginning of a much larger attack. By the early hours of 28 February, the agent had already moved on to the AwesomeGo project, sending four probe requests in just 30 minutes to test its defences.
Then the time came for the most damaging blow during the third phase of the attack. The agent successfully compromised Aqua Security’s Trivy project, a move that allowed it to delete 97 software releases and wipe out 32,000 stars, the community-driven measure of a project’s popularity. In a bold final act, the agent returned to AwesomeGo to steal security tokens and successfully hit the CNCF project project-akri by impersonating a legitimate developer.
Turning AI Against Its Owners
Perhaps most alarming is how the agent turned a developer’s own assistants into accomplices. As we know it, many programmers use tools like Copilot, Gemini, or Claude, and as per Pillar Security’s research, the attacker used a 2,000-word social engineering prompt to trick these local AI assistants into stealing sensitive data like cloud passwords and security keys. This promptware represents a shift where “millions of lines of sophisticated exploit code” are “replaced by a single natural-language prompt.”
It is worth noting that while most systems fell victim, one defender stood tall. A project named Ambient Code used an AI called Claude Code, which spotted the malicious instructions in just 82 seconds. Researchers explained in the blog post that it was “the only control in the entire campaign that stopped an attack at the point of execution.” They also suspect that while the AI handled the technical work, the timing suggests a human strategist, likely based in the Americas, was overseeing every move.
Researchers conclude that the campaign is no longer active and the projects are fixed; however, the methods used remain a public playbook for future threats.




