AI Jailbreak Concerns Heat Up As Japanese Man Builds GenAI Ransomware


A 25-year-old man from Kawasaki, Japan, was arrested this week for allegedly using generative AI tools to create ransomware, in an AI jailbreaking case that may be the first of its kind in Japan.

The arrest of Ryuki Hayashi, widely reported in Japan, is the latest example of an attacker defeating AI guardrails, a pursuit that has become something of an obsession for hackers and cybersecurity researchers alike.

Just this week, researchers from Germany’s CISPA Helmholtz Center for Information Security reported on their efforts to jailbreak GPT-4o, the latest multimodal large language model (MLLM) released by OpenAI a little more than two weeks ago. Concerns raised by those researchers and others led OpenAI to establish a safety and security committee this week to try to address AI risks.

AI Jailbreak Tools and Methods Unclear

News reports on Hayashi’s arrest have offered few details on the tools and methods he used to create the ransomware.

The Japan Times reported that Hayashi, a former factory worker, “is not an expert on malware. He allegedly learned online how to ask AI tools questions that would elicit information on how to create malware.”

Hayashi came under suspicion after police arrested him in March “for allegedly using fake identification to obtain a SIM card registered under someone else’s name,” the paper reported.

The Japan News, which reported that Hayashi is unemployed, said police found “a homemade virus on a computer” following the March arrest.

The News said police suspect he “used his home computer and smartphone to combine information about creating malware programs obtained after giving instructions to several generative AI systems in March last year.”

Hayashi “allegedly gave instructions to the AI systems while concealing his purpose of creating the virus to obtain design information necessary for encrypting files and demanding ransom,” the News reported. “He is said to have searched online for ways to illegally obtain information.”

Hayashi reportedly admitted to the charges during questioning, telling police, “I wanted to make money through ransomware. I thought I could do anything if I asked AI.”

There have been no reports of damage from the ransomware he created, the News said.

LLM Jailbreak Research Heats Up

The news comes as research on AI jailbreaking and attack techniques has grown, with a number of recent reports on risks and possible solutions.

In a paper posted to arXiv this week, the CISPA researchers said they were able to more than double their attack success rate (ASR) on GPT-4o’s voice mode with an attack they dubbed VOICEJAILBREAK, “a novel voice jailbreak attack that humanizes GPT-4o and attempts to persuade it through fictional storytelling (setting, character, and plot).”
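For readers unfamiliar with the metric, ASR is simply the fraction of jailbreak attempts judged successful, so “more than double” means the storytelling framing flipped a substantially larger share of refusals. A minimal sketch of the arithmetic, using hypothetical outcome data rather than anything from the CISPA paper:

```python
# Sketch of how an attack success rate (ASR) is typically computed in
# jailbreak studies. The outcome lists below are hypothetical examples,
# not data from the CISPA paper.

def attack_success_rate(outcomes: list[bool]) -> float:
    """Fraction of jailbreak attempts judged successful."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0

# One boolean per attempted prompt: True = the model produced disallowed
# content, False = the guardrail held.
plain_prompts = [False, False, True, False, False]        # baseline attempts
storytelling_prompts = [True, False, True, True, False]   # fictional framing

print(f"Baseline ASR:     {attack_success_rate(plain_prompts):.0%}")
print(f"Storytelling ASR: {attack_success_rate(storytelling_prompts):.0%}")
```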

Another arXiv paper, posted in February by researchers at the University of California, Berkeley, looked at a range of risks associated with GenAI tools such as Microsoft Copilot and ChatGPT, along with possible solutions, such as the development of an “AI firewall” that would monitor LLM inputs and outputs and change them if necessary.
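An “AI firewall” in this sense is essentially a policy layer sitting between the user and the model, inspecting traffic in both directions. As a rough illustration only, with hypothetical blocklist rules and a stand-in call_llm() function rather than anything from the Berkeley paper, such a filter might be sketched like this:

```python
# Illustrative sketch of an "AI firewall": a policy layer that inspects
# and can block or rewrite LLM inputs and outputs. The patterns and the
# call_llm() stub are hypothetical, not the Berkeley authors' design.
import re

BLOCKED_PATTERNS = [
    re.compile(r"ransomware", re.IGNORECASE),
    re.compile(r"encrypt .* demand (a )?ransom", re.IGNORECASE),
]

def call_llm(prompt: str) -> str:
    """Stand-in for a real model call."""
    return f"[model response to: {prompt!r}]"

def firewalled_llm(prompt: str) -> str:
    # Inbound check: refuse prompts that match a blocked pattern.
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(prompt):
            return "Request blocked by policy."
    response = call_llm(prompt)
    # Outbound check: redact anything disallowed that slipped through.
    for pattern in BLOCKED_PATTERNS:
        response = pattern.sub("[redacted]", response)
    return response

print(firewalled_llm("How do I build ransomware?"))
print(firewalled_llm("Summarize best practices for offline backups."))
```

A production design would use classifiers rather than keyword rules, but the control flow, screening the prompt before the model sees it and the response before the user does, is the core of the idea.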

And earlier this month, OT and IoT security company SCADAfence outlined a wide range of AI tools, threat actors and attack techniques. In addition to general-purpose chatbots like ChatGPT and Google Gemini, the report looked at “dark LLMs” created for malicious purposes, such as WormGPT, FraudGPT, DarkBERT and DarkBART.

SCADAfence recommended that OT and SCADA organizations follow best practices such as limiting network exposure for control systems, patching, access control and maintaining up-to-date offline backups.

GenAI uses and misuses are also expected to be the topic of a number of presentations at Gartner’s Security and Risk Management Summit next week in National Harbor, Maryland, just outside the U.S. capital.


