Sleepy Pickle Exploit Let Attackers Exploit ML Models & End-Users

Hackers are targeting, attacking, and exploiting ML models. They want to hack into these systems to steal sensitive data, interrupt services, or manipulate outcomes in their favor.

By compromising the ML models, hackers can degrade the system performance, cause financial losses, and damage the trust and reliability of AI-driven applications.

Cybersecurity analysts at Trail of Bits recently discovered that Sleepy Pickle exploit lets threat actors to exploit the ML models and attack end-users.

Technical Analysis

Researchers unveiled Sleepy Pickle, an unknown attack exploiting the insecure Pickle format for distributing machine learning models.

Unlike previous techniques compromising systems deploying models, Sleepy Pickle stealthily injects malicious code into the model during deserialization.

Free Webinar on API vulnerability scanning for OWASP API Top 10 vulnerabilities -> Book Your Spot

This allows modifying model parameters to insert backdoors or control outputs and hooking model methods to tamper with processed data by compromising end-user security, safety, and privacy.

The technique delivers a maliciously crafted pickle file containing the model and payload. When deserialized, the file executes, modifying the in-memory model before returning it to the victim.

Corrupting an ML model via a pickle file injection (Source – Trail of Bits)

Sleepy Pickle offers malicious actors a powerful foothold on ML systems by stealthily injecting payloads that dynamically tamper with models during deserialization.

This overcomes the limitations of conventional supply chain attacks by leaving no disk traces, customizing payload triggers, and broadening the attack surface to any pickle file in the target’s supply chain.

Unlike uploading covertly malicious models, Sleepy Pickle hides malice until runtime.

Attacks can modify model parameters to insert backdoors or hook methods to control inputs and outputs, enabling unknown threats like generative AI assistants providing harmful advice after weight-patching poisons the model with misinformation.

The technique’s dynamic, Leave-No-Trace nature evades static defenses.

Compromising a model to make it generate harmful outputs (Source – Trail of Bits)

The LLM models processing the sensitive data pose risks. Researchers compromised a model to steal private info during conception by injecting code recording data triggered by a secret word.

Traditional security measures were ineffective as the attack occurred within the model.

This unknown threat vector emerging from ML systems underscores their potential for abuse beyond traditional attack surfaces.

Compromising a model to steal private user data (Source – Trail of Bits)

In addition, there are other kinds of summarizer applications, such as browser apps, that improve user experience by summarizing web pages.

Since users trust these summaries, compromising the model behind them for generating harmful summaries could be a real threat and allow an attacker to serve malicious content.

Once altered summaries with malicious links are returned to users, they may click such a link and become victims of phishing scams or malware.