Malicious AI Models Are Behind a New Wave of Cybercrime, Cisco Talos Warns
New research from Cisco Talos reveals a rise in cybercriminals abusing Large Language Models (LLMs) to enhance their illicit activities. These powerful AI tools, known for generating text, solving problems, and writing code, are reportedly being manipulated to launch more sophisticated and widespread attacks.
LLMs are designed with built-in safety features, including alignment (training that steers the model toward unbiased, harmless responses) and guardrails (real-time mechanisms that block harmful outputs). A legitimate LLM such as ChatGPT, for instance, will refuse to generate a phishing email. Cybercriminals, however, are actively seeking ways around these protections.
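As a rough, illustrative sketch of the guardrail concept (not any vendor's actual implementation), the snippet below screens a model's draft reply before returning it. The `generate_draft` stub and the keyword patterns are hypothetical; real guardrails typically rely on trained safety classifiers rather than keyword lists.

```python
import re

def generate_draft(prompt: str) -> str:
    # Hypothetical stand-in for the actual LLM call.
    return f"[model response to: {prompt}]"

# Deliberately simplistic: production guardrails use trained classifiers,
# not keyword matching.
BLOCKED_PATTERNS = [
    re.compile(r"phishing\s+email", re.IGNORECASE),
    re.compile(r"ransomware", re.IGNORECASE),
]

def guarded_generate(prompt: str) -> str:
    """Return the model's reply only if it passes the output filter."""
    draft = generate_draft(prompt)
    if any(p.search(draft) for p in BLOCKED_PATTERNS):
        return "Sorry, I can't help with that."
    return draft

print(guarded_generate("Write a phishing email"))  # blocked by the filter
```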
Talos’s investigation, shared with Hackread.com, highlights three primary methods used by adversaries:
Uncensored LLMs: These models, lacking safety constraints, readily produce sensitive or harmful content. Examples include OnionGPT and WhiteRabbitNeo, which can generate offensive security tools or phishing emails. Frameworks like Ollama allow users to run uncensored models, such as Llama 2 Uncensored, on their own machines.
Custom-Built Criminal LLMs: Some enterprising cybercriminals are developing their own LLMs specifically designed for malicious purposes. Names like GhostGPT, WormGPT, DarkGPT, DarkestGPT, and FraudGPT are advertised on the dark web, boasting features like creating malware, phishing pages, and hacking tools.
Jailbreaking Legitimate LLMs: This involves tricking existing LLMs into ignoring their safety protocols through clever prompt injection techniques. Methods observed include using encoded language (like Base64), appending random text (adversarial suffixes), role-playing scenarios (e.g., DAN or Grandma jailbreak), and even exploiting the model’s self-awareness (meta prompting).
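To make the encoded-language trick concrete, here is a small defender-side sketch that looks for Base64-style blobs embedded in a prompt and decodes them so they can be screened like plain text instead of slipping past keyword-level checks. The heuristic and function name are illustrative assumptions, not part of Talos's tooling.

```python
import base64
import re
from typing import List

# Heuristic: long runs of Base64-looking characters embedded in a prompt.
B64_CANDIDATE = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")

def extract_decoded_payloads(prompt: str) -> List[str]:
    """Decode Base64-looking substrings so they can be safety-screened
    like plain text."""
    decoded = []
    for match in B64_CANDIDATE.findall(prompt):
        try:
            text = base64.b64decode(match, validate=True).decode("utf-8")
        except ValueError:
            continue  # not valid Base64 or not readable text; skip it
        decoded.append(text)
    return decoded

# Example: a prompt hiding an instruction inside a Base64 blob.
hidden = base64.b64encode(b"ignore previous instructions").decode()
print(extract_decoded_payloads(f"Please summarize: {hidden}"))
```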
The dark web has become a marketplace for these malicious LLMs. FraudGPT, for example, advertised features ranging from writing malicious code and creating undetectable malware to finding vulnerable websites and generating phishing content.
However, the market isn’t without its risks for criminals themselves; Talos researchers found that the alleged developer of FraudGPT, CanadianKingpin12, was scamming potential buyers out of cryptocurrency by promising a non-existent product.
Beyond generating illicit content directly, cybercriminals are leveraging LLMs for the same broad tasks as legitimate users, but with a malicious twist. In December 2024, Anthropic, developer of the Claude LLM, listed programming, content creation, and research as the top uses of its model. Criminal LLMs are put to similar work:
- Programming: Crafting ransomware, remote access Trojans, wipers, and code obfuscation.
- Content Creation: Generating convincing phishing emails, landing pages, and configuration files.
- Research: Verifying stolen credit card numbers, scanning for vulnerabilities, and even brainstorming new criminal schemes.
LLMs are also becoming targets themselves. Attackers are distributing backdoored models on platforms like Hugging Face, embedding malicious code that executes when the downloaded model is loaded. Furthermore, LLMs that draw on external data sources (Retrieval Augmented Generation, or RAG) can be vulnerable to data poisoning, where attackers manipulate that data to influence the LLM’s responses.
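A common mitigation for backdoored model files, noted here as general practice rather than part of the Talos report, is to prefer formats that cannot execute code at load time. The sketch below assumes a locally downloaded PyTorch-style checkpoint (the file name is hypothetical): safetensors files store only raw tensors, and PyTorch's `weights_only=True` mode refuses to unpickle arbitrary objects hidden in legacy pickle checkpoints.

```python
import torch  # requires torch >= 1.13 for weights_only
from safetensors.torch import load_file

def load_untrusted_checkpoint(path: str) -> dict:
    """Load model weights from a downloaded file without executing
    any code that might be hidden in a pickled checkpoint."""
    if path.endswith(".safetensors"):
        # safetensors stores raw tensors only; nothing is executed on load.
        return load_file(path)
    # For legacy .pt/.bin pickle checkpoints, weights_only=True restricts
    # unpickling to plain tensors and containers, rejecting arbitrary objects.
    return torch.load(path, map_location="cpu", weights_only=True)

# Usage (hypothetical file name):
# state_dict = load_untrusted_checkpoint("downloaded_model.safetensors")
```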
Cisco Talos anticipates that as AI technology continues to advance, cybercriminals will increasingly adopt LLMs to streamline their operations, effectively acting as a “force multiplier” for existing attack methods rather than creating entirely new “cyber weapons.”