Germany’s BSI issues guidelines to counter evasion attacks targeting LLMs

Germany’s BSI warns of rising evasion attacks on LLMs, issuing guidance to help developers and IT managers secure AI systems.
Germany’s BSI warns of rising evasion attacks on LLMs, issuing guidance to help developers and IT managers secure AI systems and mitigate related risks.
A significant and evolving threat to AI systems based on large language models (LLMs) arises from evasion attacks, malicious inputs designed to subvert or bypass model safeguards. The Federal Office for Information Security (BSI) of Germany addresses this issue in its publication Evasion Attacks on LLMs – Countermeasures in Practice, which targets developers, IT managers in companies and public authorities using pre-trained models (such as GPT) and other advanced IT users.
“This document is aimed at developers and IT managers in companies and public authorities that have opted to operate a pre-trained language model such as OpenAI ‘s GPT.” reads the announcement. “Furthermore, other experienced IT users can also benefit from the recommendations. Implementing the proposed countermeasures within the LLM system can make attacks more difficult or reduce potential damage.”
The report details LLM evasion methods like prompt injection and data manipulation, recommending secure prompts, filtering, Zero Trust, and anomaly monitoring.
Implementing these measures within LLM systems does not guarantee immunity, but it significantly raises the attack cost and helps reduce potential harm. The BSI recommends integrating both technical controls (e.g., filters, sandboxing, RAG with trusted retrieval) and organizational practices (e.g., adversarial testing, governance, training) as part of a defence-in-depth strategy.
In essence, as organisations increasingly adopt LLMs, they must assume that no single control is sufficient. Rather, they should adopt layered safeguards and continuous monitoring to address the special risks of evasion attacks, otherwise even well-configured systems can be subverted.
Evasion attacks on large language models occur during runtime, not training, where adversaries use prompt injections, jailbreaks, or adversarial inputs to bypass safeguards and alter model behavior.
The BSI report explains these threats and offers countermeasures such as secure system prompts, malicious content filtering, and requiring explicit user confirmation before execution. It also includes a practical checklist and use cases to help integrate these defenses into operational AI systems.
“The BSI publication introduces the topic of evasion attacks and presents a variety of practical countermeasures.” concludes the announcement. “A checklist facilitates both theoretical and practical implementation. Use cases demonstrate how the presented countermeasures can be integrated into a user’s own system.”
Follow me on Twitter: @securityaffairs and Facebook and Mastodon
Pierluigi Paganini
(SecurityAffairs – hacking, LLMs)
