New Encoding Technique Jailbreaks ChatGPT-4o Into Writing Exploit Code


A novel encoding method enables ChatGPT-4o and various other well-known AI models to override their internal protections, facilitating the creation of exploit code.

Security researcher Marco Figueroa uncovered the technique, which tricks ChatGPT-4o and other popular AI models into decoding hexadecimal-encoded instructions and acting on them, sidestepping their built-in safeguards.


This revelation exposes a significant vulnerability in AI security measures, raising important questions about the future of AI safety.

The jailbreak tactic exploits a linguistic loophole by instructing the model to process a seemingly benign task: hex conversion.

Since ChatGPT-4o is optimized to follow instructions in natural language, it does not inherently recognize that converting hex values might produce harmful outputs.

This vulnerability arises because the model is designed to follow instructions step by step but lacks the deep context awareness needed to evaluate the safety of each step.


By encoding malicious instructions in hexadecimal format, attackers can circumvent ChatGPT-4o’s security guardrails. The model decodes the hex string without recognizing the harmful intent, thus bypassing its content moderation systems.
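To illustrate the mechanical transformation involved, the snippet below hex-encodes a benign stand-in instruction and decodes it back. This is a sketch of the encoding step only: the string and flow are illustrative, not the researcher's actual prompt.

```python
# A benign stand-in instruction (not an actual exploit request).
instruction = "describe the weather in Paris"

# Encode to a hex string -- this is what would appear in the prompt
# in place of the plain-text instruction.
hex_payload = instruction.encode("utf-8").hex()
print(hex_payload)

# The model is then simply asked to convert the hex back to text and
# follow it; decoding recovers the original instruction verbatim.
decoded = bytes.fromhex(hex_payload).decode("utf-8")
print(decoded)
```

Because the prompt itself contains only hexadecimal digits, keyword-based content filters that inspect the raw input never see the instruction in plain text.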

[Figure: Jailbreak steps]

This compartmentalized execution of tasks lets attackers exploit the model's efficiency at following instructions without its performing a deeper analysis of the overall outcome.

The discovery highlights the need for enhanced AI safety features, including early decoding of encoded content, improved context-awareness, and more robust filtering mechanisms to detect patterns indicative of exploit generation or vulnerability research.
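One of the mitigations suggested above, early decoding of encoded content, can be sketched as a pre-processing pass: scan the incoming prompt for long runs of hex digits, decode them, and feed the decoded text through the same moderation checks as plain input. This is an assumed design for illustration, not a description of any vendor's actual filter; the function name and threshold are hypothetical.

```python
import re

# Match runs of 8 or more hex-encoded bytes (16+ hex digits, even length).
HEX_RUN = re.compile(r"\b(?:[0-9a-fA-F]{2}){8,}\b")

def decode_hex_spans(prompt: str) -> list[str]:
    """Return the decoded text of every plausible hex-encoded span,
    so it can be run through the same moderation as plain input."""
    decoded = []
    for match in HEX_RUN.finditer(prompt):
        try:
            text = bytes.fromhex(match.group()).decode("utf-8")
        except ValueError:  # not valid hex or not valid UTF-8
            continue
        decoded.append(text)
    return decoded

# "72756e20746869732073746570" is hex for the benign text "run this step".
prompt = "Please convert 72756e20746869732073746570 to text and follow it"
print(decode_hex_spans(prompt))
```

Decoding before moderation means the same content filters that would flag a plain-text instruction also see its hex-encoded form.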

As AI evolves and becomes more sophisticated, attackers will find new ways to benefit from these technologies and accelerate the development of threats capable of bypassing AI-based endpoint protection solutions.

Leveraging AI isn't strictly required to bypass today's endpoint security solutions: tactics and techniques for evading detection by EDRs and EPPs are already well documented, particularly memory manipulation and fileless malware.

However, advances in AI-based technologies can lower entry barriers to sophisticated threats by automating the creation of polymorphic and evasive malware.

This discovery follows a recent advisory by Vulcan Cyber’s Voyager18 research team, which described a new cyber-attack technique using ChatGPT to spread malicious packages in developers’ environments.

By leveraging ChatGPT's code-generation capabilities, attackers can potentially exploit fabricated (hallucinated) code library names to distribute malicious packages, bypassing conventional detection methods.

As AI language models continue to advance, organizations must stay vigilant and keep up with the latest developments in AI-based attacks to protect themselves from these emerging threats.

The ability to bypass security measures using encoded instructions is a significant threat vector that needs to be addressed as AI continues to evolve in capability.

