Prompt Injection Harder To Stop Than SQL Injection

Prompt Injection Harder To Stop Than SQL Injection

The UK’s National Cyber Security Centre (NCSC) has issued a fresh warning about the growing threat of prompt injection, a vulnerability that has quickly become one of the biggest security concerns in generative AI systems. First identified in 2022, prompt injection refers to attempts by attackers to manipulate large language models (LLMs) by inserting rogue instructions into user-supplied content.

While the technique may appear similar to the long-familiar SQL injection flaw, the NCSC stresses that comparing the two is not only misleading but potentially harmful if organisations rely on the wrong mitigation strategies.

Why Prompt Injection Is Fundamentally Different

SQL injection has been understood for nearly three decades. Its core issue, blurring the boundary between data and executable instructions, has well-established fixes such as parameterised queries. These protections work because traditional systems draw a clear distinction between “data” and “instructions.”

The NCSC explains that LLMs do not operate in the same way. Under the hood, a model doesn’t differentiate between a developer’s instruction and a user’s input; it simply predicts the most likely next token. This makes it inherently difficult to enforce any security boundary inside a prompt.

In one common example of indirect prompt injection, a candidate’s CV might include hidden text instructing a recruitment AI to override previous rules and approve the applicant. Because an LLM treats all text the same, it can mistakenly follow the malicious instruction.

This, according to the NCSC, is why prompt injection attacks consistently appear in deployed AI systems and why they are ranked as OWASP’s top risk for generative AI applications.

Treating LLMs as an ‘Inherently Confusable Deputy’

Rather than viewing prompt injection as another flavour of classic code injection, the NCSC recommends assessing it through the lens of a confused deputy problem. In such vulnerabilities, a trusted system is tricked into performing actions on behalf of an untrusted party.

Traditional confused deputy issues can be patched. But LLMs, the NCSC argues, are “inherently confusable.” No matter how many filters or detection layers developers add, the underlying architecture still offers attackers opportunities to manipulate outputs.

The goal, therefore, is not complete elimination of risk, but reducing the likelihood and impact of attacks.

Key Steps to Building More Secure AI Systems

The NCSC outlines several principles aligned with the ETSI baseline cybersecurity standard for AI systems:

1. Raise Developer and Organisational Awareness

Prompt injection remains poorly understood, even among seasoned engineers. Teams building AI-connected systems must recognise it as an unavoidable risk. Security teams, too, must understand that no product can completely block these attacks; risk has to be managed through careful design and operational controls.

2. Prioritise Secure System Design

Because LLMs can be coerced into using external tools or APIs, designers must assume they are manipulable from the outset. A compromised prompt could lead an AI assistant to trigger high-privilege actions, effectively handing those tools to an attacker.

Researchers at Google, ETH Zurich, and independent security experts have proposed architectures that constrain the LLM’s authority. One widely discussed principle: if an LLM processes external content, its privileges should drop to match the privileges of that external party.

3. Make Attacks Harder to Execute

Developers can experiment with techniques that separate “data” from expected “instructions”, for example, wrapping external input in XML tags. Microsoft’s early research shows these techniques can raise the barrier for attackers, though none guarantee total protection.

The NCSC warns against simple deny-listing phrases such as “ignore previous instructions,” since attackers can easily rephrase commands.

4. Implement Robust Monitoring

A well-designed system should log full inputs, outputs, tool integrations, and failed API calls. Because attackers often refine their attempts over time, early anomalies, like repeated failed tool calls, may provide the first signs of an emerging attack.

A Warning for the AI Adoption Wave

The NCSC concludes that relying on SQL-style mitigations would be a serious mistake. SQL injection saw its peak in the early 2010s after widespread adoption of database-driven applications. It wasn’t until years of breaches and data leaks that secure defaults finally became standard.

With generative AI rapidly embedding itself into business workflows, the agency warns that a similar wave of exploitation could occur, unless organisations design systems with prompt injection risks front and center.



Source link