
LayerX is calling the flaw “ClaudeBleed.”
“LayerX reported the flaw to Anthropic,” LayerX researcher Aviad Gispan said in a blog post. “Anthropic replied that they were already aware of the issue and that it would be fixed in the next version of the extension.” However, Gispan added, Anthropic’s fix was partial, and the flaw can still be exploited.
The post demonstrated different ways the flaw can still be exploited, including sending a file from a Google Drive folder to an outsider, sending an email on behalf of a remote attacker, stealing code from a private repository on GitHub, and summarizing emails and sending them to an external user.
“ClaudeBleed is a useful demonstration of why monitoring AI agents at the prompt layer is fundamentally insufficient,” said Ax Sharma, head of research at Manifold Security. “The most sophisticated part of this attack isn’t the injection, but that the agent’s perceived environment was manipulated to produce actions that looked legitimate from the inside. That’s the class of threat the industry needs to be building defenses for.”
