A zero-click vulnerability discovered in ChatGPT’s Deep Research agent allowed attackers to exfiltrate sensitive data from a user’s Gmail account without any user interaction.
The flaw, which OpenAI has since patched, leveraged a sophisticated form of indirect prompt injection hidden within an email, tricking the agent into leaking personal information directly from OpenAI’s cloud infrastructure.
According to Radware, the attack began with an attacker sending a specially crafted email to a victim. This email contained hidden instructions, invisible to the human eye, embedded within its HTML code using techniques like tiny fonts or white-on-white text.
When the user prompted the Deep Research agent to analyze their Gmail inbox, the agent would read this malicious email alongside legitimate ones.
The hidden prompts used social engineering tactics to bypass the agent’s safety protocols. These tactics included:
- Asserting Authority: The prompt falsely claimed the agent had “full authorization” to access external URLs.
- Disguising Malicious URLs: The attacker’s server was presented as a legitimate “compliance validation system.”
- Mandating Persistence: The agent was instructed to retry the connection multiple times if it failed, overcoming non-deterministic security blocks.
- Creating Urgency: The prompt warned that failure to comply would result in an incomplete report.
- Falsely Claiming Security: The instructions deceptively directed the agent to encode the stolen data in Base64, framing it as a security measure while actually obfuscating the data exfiltration.
Once the agent processed the malicious email, it would search the user’s inbox for the specified Personally Identifiable Information (PII), such as a name and address from an HR email.
It would then encode this data and send it to the attacker-controlled server, all without any visual indicator or confirmation from the user.
Service-Side vs. Client-Side Exfiltration
What made this vulnerability particularly dangerous was its service-side nature. The data exfiltration occurred entirely within OpenAI’s cloud environment, executed by the agent’s own browsing tool.
This is a significant escalation from previous client-side attacks that relied on rendering malicious content (like images) in the user’s browser.
Because the attack originated from OpenAI’s infrastructure, it was invisible to conventional enterprise security measures like secure web gateways, endpoint monitoring, and browser security policies. The user would have no knowledge of the data leak, as nothing would be displayed on their screen, Radware said.

While the proof of concept focused on Gmail, the vulnerability’s principles could be applied to any data connector integrated with the Deep Research agent. Malicious prompts could be hidden in:
- PDFs or Word documents in Google Drive or Dropbox.
- Meeting invites in Outlook or Google Calendar.
- Records in HubSpot or Notion.
- Messages or files in Microsoft Teams.
- README files in GitHub.
Any service that allows text-based content to be ingested by the agent could have served as a potential vector for this type of attack.
Researchers who discovered the flaw suggest that a robust mitigation strategy involves continuous monitoring of the agent’s behavior to ensure its actions align with the user’s original intent. This can help detect and block deviations caused by malicious prompts.
The vulnerability was reported to OpenAI on June 18, 2025. The issue was acknowledged, and a fix was deployed in early August. OpenAI marked the vulnerability as resolved on September 3, 2025.
Find this Story Interesting! Follow us on Google News, LinkedIn, and X to Get More Instant Updates.
Source link