Key Takeaways
- Sensitive data shared with ChatGPT conversations could be silently exfiltrated without the user’s knowledge or approval.
- Check Point Research discovered a hidden outbound communication path from ChatGPT’s isolated execution runtime to the public internet.
- A single malicious prompt could turn an otherwise ordinary conversation into a covert exfiltration channel, leaking user messages, uploaded files, and other sensitive content.
- A backdoored GPT could abuse the same weakness to obtain access to user data without the user’s awareness or consent.
- The same hidden communication path could also be used to establish remote shell access inside the Linux runtime used for code execution.
What Happened
AI assistants now handle some of the most sensitive data people own. Users discuss symptoms and medical history. They ask questions about taxes, debts, and personal finances, upload PDFs, contracts, lab results, and identity-rich documents that contain names, addresses, account details, and private records. That trust depends on a simple expectation: data shared in the conversation remains inside the system.
ChatGPT itself presents outbound data sharing as something restricted, visible, and controlled. Potentially sensitive data is not supposed to be sent to arbitrary third parties simply because a prompt requests it. External actions are expected to be mediated through explicit safeguards, and direct outbound access from the code-execution environment is restricted.
Our research uncovered a path around that model.
We found that a single malicious prompt could activate a hidden exfiltration channel inside a regular ChatGPT conversation.
The Intended Safeguards
ChatGPT includes useful tools that can retrieve information from the internet and execute Python code. At the same time, OpenAI has built safeguards around those capabilities to protect user data. For example, the web-search capability does not allow sensitive chat content to be transmitted outward through crafted query strings. The Python-based Data Analysis environment was designed to prevent internet access as well. OpenAI describes that environment as a secure code execution runtime that cannot generate direct outbound network requests.

OpenAI also documents that so called GPTs can send relevant parts of a user’s input to external services through APIs. A GPT is a customized version of ChatGPT that can be configured with instructions, knowledge files, and external integrations. GPT “Actions” provide a legitimate way to call third-party APIs and exchange data with outside services. Actions are useful for enterprise workflows, access to internal business systems, customer support operations, and other integrations that connect ChatGPT to external services, including simpler use cases such as travel or weather lookups. The key point is visibility: the user sees that data is about to leave ChatGPT, sees where it is going, and decides whether to allow it.

In other words, legitimate outbound data flows are designed to happen through an explicit, user-facing approval process.
From One Message to Silent Exfiltration
From a security perspective, the obvious attack surfaces looked strong. The ability to send chat data through tools not designed for that purpose was strictly limited. Sending data through a legitimate GPT integration using external API calls also required explicit user confirmation.
The vulnerability we discovered allowed information to be transmitted to an external server through a side channel originating from the container used by ChatGPT for code execution and data analysis. Crucially, because the model operated under the assumption that this environment could not send data outward directly, it did not recognize that behavior as an external data transfer requiring resistance or user mediation. As a result, the leakage did not trigger warnings about data leaving the conversation, did not require explicit user confirmation, and remained largely invisible from the user’s perspective.
At a high level, the attack began when the victim sent a single malicious prompt into a ChatGPT conversation. From that moment on, each new message in the chat became a potential source of leakage. The scope of that leakage depended on how the prompt framed the task for the model: it could include raw user text, text extracted from uploaded files, or selected model-generated output such as summaries, medical assessments, conclusions, and other condensed intelligence. This made the attack flexible, because it allowed the attacker to target not only original user data, but also the most valuable information produced by the model itself.
That attack pattern fits naturally into ordinary user behavior. The internet is full of websites, blog posts, forums, and social media threads promoting “top prompts for productivity,” “best prompts for work,” and other ready-made instructions that supposedly improve ChatGPT’s performance. For many users, copying and pasting such prompts into a new conversation is routine and does not appear risky, because the prevailing expectation is that AI assistants will not silently leak conversation data to external parties, and that this boundary cannot be changed through an ordinary prompt. A malicious prompt distributed in that format could therefore be presented as a harmless productivity aid and interpreted as just another useful trick for getting better results from the assistant.
A broader campaign could use an even more convincing lure: prompts advertised as a way to unlock premium capabilities for free. Claims about enabling Pro-level behavior, hidden modes, or advanced features on a lower-tier account would give the attacker a natural pretext for including unusual instructions, long text blocks, or opaque fragments that might otherwise seem suspicious. Because the user already expects a nonstandard sequence, those elements can easily be perceived as part of the promised “hack.” A carefully crafted prompt could then create the appearance of enhanced functionality while quietly turning the conversation into a source of exfiltrated data.

Once the malicious prompt was placed into the chat, the conversation effectively became a covert collection channel. From that point on, summaries of subsequent user messages could be silently transmitted to an attacker-controlled server. As shown in Video 1, this took place without any warning, approval request, or other visible indication of external data transfer.
Malicious GPTs
The same attack pattern becomes even more dangerous when embedded inside a custom GPT.
GPTs allow developers to package instructions, knowledge files, and external integrations into a reusable assistant that other users can interact with. From the user’s perspective, the interaction looks like a normal ChatGPT conversation with a specialized tool.
In that scenario, the attacker no longer needs to rely on the victim copying a prompt from an external source. The malicious logic can be embedded directly in the GPT’s instructions and files. A user only needs to open the GPT and begin interacting with it as intended.
This changes the threat model significantly. Officially, GPT builders do not have access to individual user conversations with their GPTs. However, a GPT designed to exploit the vulnerability could still cause selected information from those conversations to be transmitted to an attacker-controlled server. From the user’s perspective, the interaction would remain indistinguishable from an ordinary session with a specialized assistant.
To demonstrate the practical impact of the vulnerability, we built a proof of concept around a sensitive scenario: a GPT acting as a personal doctor.
In the demonstration, the user uploaded a PDF containing laboratory test results. The document also contained personal information, including the patient’s name. The user then described symptoms and asked the GPT to help interpret the results.
From the user’s perspective, the interaction looked completely ordinary. The GPT analyzed the document and produced a medical assessment based on the provided information.
When asked directly whether the uploaded data had been sent anywhere, ChatGPT answered confidently that it had not, explaining that the file was only stored in a secure internal location. At the same time, the user saw no approval prompts or warnings about external data transfer, unlike the confirmation dialogs that normally appear when a GPT Action sends information to a third-party service.

Meanwhile, the attacker’s server received highly sensitive data extracted from the conversation: the patient’s identity taken from the uploaded document together with the model’s medical assessment.
This illustrates an important aspect of the attack. The attacker does not necessarily need to steal entire documents. Instead, the prompt could instruct the model to transmit the most valuable information it produces. In the medical scenario, that meant the patient’s identifying details together with the model’s assessment. In other contexts, it could mean financial conclusions, contract summaries, or strategic insights extracted from long documents.
From Data Exfiltration to Remote Shell
The same communication channel could be used for more than silent data exfiltration.
Once a reliable bidirectional channel existed between the execution runtime and the attacker-controlled server, it became possible to send commands into the container and receive the results back through the same path. In effect, the attacker could establish a remote shell inside the Linux environment that ChatGPT creates to perform code execution and data analysis tasks.
This interaction happened outside the normal ChatGPT response flow. When users interact with the assistant through the chat interface, generated actions and outputs remain subject to the model’s safety mechanisms and checks. However, commands executed through the side channel bypassed that mediation entirely. The results were returned directly to the attacker’s server without appearing in the conversation or being filtered by the model.
DNS Tunneling in an AI Runtime
The side channel that enabled both data exfiltration and remote command execution relied on DNS resolution.
Normally, DNS is used to resolve domain names into IP addresses. From a security perspective, however, DNS can also function as a data transport channel. Instead of using DNS only for ordinary name resolution, an attacker can encode data into subdomain labels and trigger resolution of those hostnames. Because DNS resolution propagates the requested hostname through the normal recursive lookup process, the resolver chain can carry that encoded data outward.
In our case, this mattered because the ChatGPT execution runtime did not permit conventional outbound internet access, but DNS resolution was still available as part of the environment’s normal operation. Standard attempts to reach external hosts directly were blocked. DNS, however, still provided a narrow communication path that crossed the isolation boundary indirectly through legitimate resolver infrastructure.
To exfiltrate data, content could be encoded into DNS-safe fragments, placed into subdomains, and reconstructed on the attacker’s side from the incoming queries. To send instructions back, the attacker could encode small command fragments into DNS responses and let them travel back through the same resolution path. A process running inside the container could then read those responses, reassemble the payload, and continue the exchange.

This effectively turned DNS infrastructure into a tunnel between the isolated runtime and an attacker-controlled server. The tunnel create in this way is sufficient for two practical goals: silently leaking selected data from the conversation and maintaining command execution inside the Linux environment created for code execution and data analysis.
Conclusion
Check Point Research reported the issue to OpenAI. OpenAI confirmed that it had already identified the underlying problem internally, and the fix was fully deployed on February 20, 2026.
The broader lesson, however, goes beyond this specific case. AI systems are evolving at an extraordinary pace. New capabilities are constantly being introduced, enabling assistants to solve complex mathematical problems, analyze large datasets, generate and execute scripts, and automate multi-step tasks that previously required dedicated development environments. These capabilities bring enormous benefits. At the same time, every new tool expands the system’s attack surface and can introduce new security challenges for both users and platform providers.
Modern AI assistants increasingly operate as real execution environments. They read files, run code, search in the web while processing highly sensitive information such as medical records, financial data, legal documents, and other personal or organizational data. Protecting these environments requires careful control over every possible outbound communication path, including infrastructure layers that users never see.
As AI tools become more powerful and widely used, security must remain a central consideration. These systems offer enormous benefits, but adopting them safely requires careful attention to every layer of the platform.

