Cybersecurity researchers at Pillar Security, an AI software security firm, have found a way to trick Docker’s new AI agent, Ask Gordon, into stealing private information. The researchers discovered that the AI assistant could be manipulated through a method called indirect prompt injection.
This happens because the assistant has a “blind spot” in how it trusts information. As we know it, any AI tool becomes risky when it can access private data, read untrusted content from the web, and talk to external servers.
How Does It Work
Docker is a major platform used by millions of developers to build and share software. To make things easier, they introduced a beta tool, ‘Ask Gordon,’ that can answer questions and help with tasks using simple, natural language.
However, researchers noted attackers could take control of the assistant through a technique known as metadata poisoning. By putting hidden, malicious instructions inside the description or metadata of a software package on the public Docker Hub, hackers can wait for an unsuspecting user to interact with that package.
The research, which was shared with Hackread.com, further revealed that a user only needs to ask a simple question like, “Describe this repo,” for the AI to read those hidden instructions and follow them as if they were legitimate commands.
The Lethal Trifecta
Probing further, researchers found that the AI was falling into a trap because of a lethal trifecta, a term coined by renowned technologist Simon Willison. In this case, the assistant could gather chat history and sensitive build logs (records of the entire software creation process) and send them to a server owned by the attacker.
The results were instant. Within seconds, an attacker could get their hands on build IDs, API keys, and internal network details. This effectively means the agent acted as its own command-and-control client, turning a helpful tool into a weapon against the user.
To explain why this worked, the team used a framework called CFS (Context, Format, and Salience). It shows how a malicious instruction succeeds by fitting the AI’s current task (Context), looking like standard data (Format), and being positioned where the AI gives it high priority (Salience).
A Quick Fix for Users
It is worth noting that this vulnerability (formally known as CWE-1427 or Improper Neutralization of Input for AI Prompts) wasn’t just a theoretical guess; researchers proved it by successfully stealing data during their tests. They immediately notified Docker’s security team, which acted promptly. The issue was officially resolved on November 6, 2025, with the release of Docker Desktop version 4.50.0.
The fix introduces a “human-in-the-loop” (HITL) system. Now, instead of Gordon automatically following instructions it finds online, it must stop and ask the user for explicit permission before it connects to an outside link or executes a sensitive tool. This simple step ensures that the user remains in control of what the AI is actually doing.
