KEY FINDINGS
- AI chatbots contain security flaws that threat actors can exploit to jailbreak them.
- Jailbreaking AI chatbots can allow cybercriminals to create uncensored content, generate phishing emails, and carry out other malicious activities.
- Cybercriminals are increasingly using AI chatbots to evade security measures and carry out their attacks.
- AI security is still in its early stages, but researchers are working to develop effective strategies to protect AI chatbots from exploitation.
- Users can protect themselves from AI chatbot jailbreaking by being aware of the risks and taking steps to mitigate them, such as using only trusted chatbots and being careful about the information they share with them.
We have been witnessing the steady rise of AI-powered chatbots. Advanced AI-based language tools like ChatGPT, trained on vast datasets, have improved our conversational capabilities tremendously by generating coherent responses to almost any query.
However, the latest research reveals that these chatbots contain security flaws which threat actors can exploit to jailbreak them, bypassing the ethical guidelines and security mechanisms implemented by the organizations behind them.
Research conducted by SlashNext, the cybersecurity firm known for uncovering WormGPT, and shared with Hackread.com on Tuesday, 12 September 2023, reveals the potential dangers of vulnerable chatbots. Researchers noted that jailbreaking chatbots allows cybercriminals to create uncensored content, with devastating negative consequences.
Threat Actors Exploiting AI Chatbots
SlashNext researchers claim to have discovered multiple actors bragging on discussion forums about successfully jailbreaking AI chatbots and using them to write phishing emails. In addition, various threat actors are selling AI bots that appear to be custom LLMs (large language models).
However, these aren’t custom LLMs but jailbroken versions of widely used chatbots like ChatGPT and other AI tools. Buying such a tool rather than jailbreaking a chatbot themselves is more advantageous for cybercriminals because their identities remain undisclosed, so attackers can keep exploiting AI-generated content to fulfil their nefarious objectives.
How It All Started
Tools for manipulating AI have been surfacing one after another lately, allowing cybercriminals to pull off phishing and other scams convincingly. The trend gained momentum with WormGPT, discovered in July 2023, followed by FraudGPT, a subscription-based tool that surfaced on the Dark Web the same month. Since then, many variants have emerged, such as EscapeGPT, BadGPT, DarkGPT, and Black Hat GPT, forcing the cybersecurity community to take notice of the impending havoc.
Still, researchers agree that most of these tools don’t use a custom LLM, with WormGPT being the exception. Instead, they use interfaces that connect to jailbroken versions of public chatbots, hidden behind a wrapper. This means cybercriminals can exploit jailbroken versions of language models like OpenGPT and present them as custom LLMs. Some tools even grant access without authentication as long as users pay in cryptocurrency, allowing buyers to exploit AI-generated content easily while maintaining anonymity.
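To illustrate what such a wrapper means in principle, the minimal sketch below simply relays a user's prompt to a public chatbot API while concealing which backend actually answers. It is an assumption-laden illustration, not code from SlashNext's research: the endpoint, model name, environment variable, and response fields are all hypothetical.

```python
# Minimal sketch (hypothetical endpoint, model name, and env var) of how a
# "custom LLM" front-end can simply relay prompts to a public chatbot API
# while concealing which backend is actually answering.
import os
import requests

API_URL = "https://api.example-llm.com/v1/chat/completions"  # hypothetical
API_KEY = os.environ["UPSTREAM_API_KEY"]                      # hypothetical

def relay(prompt: str) -> str:
    """Forward the user's prompt to the real (public) chatbot and return its reply."""
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "model": "example-chat-model",  # hypothetical model name
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=30,
    )
    response.raise_for_status()
    # The buyer only ever sees the text, never which service produced it.
    return response.json()["choices"][0]["message"]["content"]
```

Because the wrapper hides the upstream service, a buyer has no easy way to verify whether they are paying for a genuinely custom model or a repackaged public chatbot.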
What is AI Chatbot Jailbreaking?
Chatbot jailbreaking refers to exploiting inherent weaknesses in a chatbot’s prompting mechanism: a carefully crafted command forces the AI to ignore its built-in safety measures and guidelines and switch into an unrestricted mode.
Hence, the chatbot generates output that doesn’t comply with its built-in restrictions and censorship. Jailbreak prompts are designed to unlock the chatbot’s full, unfiltered potential. For instance, the Anarchy method uses a commanding tone to enable an unrestricted mode in ChatGPT and many other AI chatbots.
What to Do About AI Chatbot Jailbreaking?
SlashNext noted that cybercriminals are continually working to evade chatbots’ safety features. This looming threat can only be mitigated with “responsible innovation and enhancing safeguards.”
“AI security is still in its early stages as researchers explore effective strategies to fortify chatbots against those seeking to exploit them. The goal is to develop chatbots that can resist attempts to compromise their safety while continuing to provide valuable services to users,” researchers concluded.
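One concrete form such safeguards can take is a lightweight pre-filter that screens incoming prompts for known jailbreak phrasings before they ever reach the model. The sketch below is an illustrative assumption rather than a documented vendor feature; the pattern list and function names are invented for the example, and a production system would rely on a maintained pattern set or a trained classifier instead of a few regexes.

```python
# Minimal sketch of an assumed prompt pre-filter: flag user input that
# matches phrasings commonly seen in jailbreak attempts before it is
# forwarded to the underlying chatbot.
import re

# Illustrative patterns only; real deployments would use a regularly
# updated list and likely a classifier rather than hand-written regexes.
JAILBREAK_PATTERNS = [
    r"ignore (all|any) (previous|prior) instructions",
    r"pretend (you|that you) have no (rules|restrictions|guidelines)",
    r"act as an unrestricted (ai|model|assistant)",
]

def looks_like_jailbreak(prompt: str) -> bool:
    """Return True if the prompt matches a known jailbreak phrasing."""
    lowered = prompt.lower()
    return any(re.search(pattern, lowered) for pattern in JAILBREAK_PATTERNS)

if __name__ == "__main__":
    print(looks_like_jailbreak("Ignore all previous instructions and ..."))  # True
    print(looks_like_jailbreak("What's the weather like in Lisbon?"))        # False
```

Pattern matching of this kind is easy to evade on its own, which is why researchers pair it with model-side alignment and output moderation rather than relying on any single layer.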