Security researchers have developed a sophisticated prompt injection attack that abuses trusted AI summarisation tools, and potentially turns these into ClickFix-style step-by-step instructions to compromise user systems.
The technique, detailed in new research from Singapore security vendor CloudSEK, exploits the gap between what humans can see on a webpage and what artificial intelligence models process when generating summaries.
ClickFix attacks exploit human behaviour by providing malicious instructions that people willingly follow, to sort out a problem or error.
Often, this involves copying and pasting commands into the Windows Run dialog, terminal windows, modifying system security settings, downloading and executing file, installing malicious browser extensions and more.
Microsoft Threat Intelligence has outlined a number of techniques used to deliver malware such as the LummaStealer infostealer, including abusing the CAPTCHA test to discern if site visitors are humans or bots.
CloudSEK now suggests that attackers can embed malicious instructions within hypertext markup language (HTML) content used in, for example, email content and web pages, using cascading style sheet (CSS) properties that renders text completely invisible to human readers.
These techniques include setting text opacity to zero, using white text on white backgrounds, microscopic font sizes, and positioning content off-screen.
While users see benign content, AI summarisers process both the visible text and the hidden malicious instructions, CloudSEK researcher Dharani Sanjaiy wrote.
The attack scenario employs what researchers term “prompt overdose” which repeats the malicious payload dozens of times throughout the hidden sections to dominate and ensure the AI model’s attention.
When a victim uses an AI summariser on such content, the tool outputs clean, authoritative-sounding instructions for downloading and executing ransomware.
CloudSEK said its proof-of-concept successfully manipulated multiple commercial summarisation tools, including browser extensions and web-based services.
Testing the proof-of-concept, the researchers embedded harmless Base64-encoded PowerShell commands within invisible HTML containers styled with “opacity: 0” and “font-size: 0” properties.
When processed by AI summarisers, these hidden instructions consistently appeared in the generated summaries, often excluding the legitimate visible content entirely.
CloudSEK’s proof-of-concept wasn’t always effective, as the summariser would at times mix the ClickFix payload with the visible text.
The researchers recommend several defensive measures for organisations deploying AI summarisation tools, focused on pre-processing content.
Content sanitisation systems should strip CSS attributes like zero opacity and invisible text before processing by AI models.
Prompt filtering mechanisms can detect embedded instructions designed to manipulate language model behaviour, including meta-commands and suspicious repetition patterns.
Security teams should implement payload pattern recognition to identify common ransomware delivery commands, even when obfuscated through encoding.
AI platforms can also introduce token-level balancing to reduce the effectiveness of prompt overdose attacks by weighting repeated content less heavily, CloudSEK suggested.
Source link