BentoML Vulnerability Allows Remote Code Execution on AI Servers


TL;DR: A critical deserialization vulnerability (CVSS 9.8 – CVE-2025-27520) in BentoML (v1.3.8–1.4.2) lets attackers execute remote code without authentication. Discovered by Checkmarx Zero. Upgrade to v1.4.3 immediately. WAF workaround is limited.

A critical security vulnerability has been identified in BentoML, a widely used Python framework for building and running AI-powered online services. The flaw, tracked as CVE-2025-27520 with a critical CVSS score of 9.8 and discovered by cybersecurity researchers at Checkmarx Zero, could allow unauthenticated attackers to take complete control of the servers running these AI services.

According to Checkmarx research shared with Hackread.com, attackers can exploit the flaw by sending crafted malicious data to a BentoML server, enabling RCE (remote code execution). This could lead to data theft or full server takeover.

The problem lies within a specific part of BentoML’s code called the deserialize_value() function, located in a file named serde.py. This function takes prepared data in a special format (called serialized data) and turns it back into a usable form for the AI service.

However, researchers found that this process does not properly validate the incoming data, so an attacker can sneak in malicious instructions disguised as regular data, and BentoML unknowingly runs the attacker’s code while deserializing it.

Interestingly, according to Checkmarx’s report, this vulnerability is essentially a repeat of CVE-2024-2912, which was fixed in BentoML version 1.2.5. The fix was later removed in version 1.3.8, causing the same dangerous weakness to reappear.

“Both CVEs deal with the same exact issue: an Insecure Deserialization vulnerability that can be exploited by sending an HTTP request to any valid endpoint and trigger RCE,” Checkmarx’s Bruno Dias wrote in a blog post.

Attackers can exploit this by sending a malicious pickle to a BentoML server. In Python, pickle is a way to serialize complex data structures into a stream of bytes so they can be saved and loaded later. Pickled data can also contain instructions that run when it is loaded. So, an attacker can craft a pickle that makes the server execute harmful commands, such as opening a backdoor connection to a command-and-control server.
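To see why loading untrusted pickles is dangerous, here is a minimal, harmless sketch. It is not BentoML's code or the actual exploit. The class name Payload is made up for illustration, and the payload only asks the loader to call os.getpid(), where a real attacker would pick something like a shell command.

```python
import os
import pickle


class Payload:
    """Hypothetical stand-in for attacker-controlled data."""

    def __reduce__(self):
        # pickle stores "call this function with these arguments" and the
        # loader performs that call while rebuilding the object.
        # os.getpid is a harmless placeholder for an attacker's command.
        return (os.getpid, ())


blob = pickle.dumps(Payload())  # the bytes an attacker would send
result = pickle.loads(blob)     # the call happens here, on the loading side

# The loader did not get a Payload back; it got the return value of the
# function that the pickle told it to run.
print(type(result).__name__, result == os.getpid())
```

The key point is that the function call is chosen by whoever produced the bytes, not by the code that loads them.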

While the initial security advisory from NIST suggested versions 1.3.4 through 1.4.2 were vulnerable, Checkmarx researchers found that the affected range is narrower: only versions 1.3.8 through 1.4.2 are vulnerable.
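For a quick check of whether a given install falls in that range, something like the following sketch could help. It assumes the 1.3.8 to 1.4.2 range reported by Checkmarx and a plain major.minor.patch version string. The helper names are illustrative and not part of BentoML.

```python
import re
from importlib import metadata

# Range reported by Checkmarx (inclusive on both ends).
FIRST_AFFECTED = (1, 3, 8)
LAST_AFFECTED = (1, 4, 2)


def parse_version(version):
    """Turn '1.4.2' into (1, 4, 2); any suffix after the patch number is ignored."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version):
    """True if the version lies inside the reported vulnerable range."""
    return FIRST_AFFECTED <= parse_version(version) <= LAST_AFFECTED


def installed_bentoml_is_affected():
    """Check the locally installed package; None if BentoML is not installed."""
    try:
        return is_affected(metadata.version("bentoml"))
    except metadata.PackageNotFoundError:
        return None


print(is_affected("1.4.2"), is_affected("1.4.3"))
```

Because the parser drops anything after the patch number, pre-release or development builds should be checked by hand.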


The good news is that a fix has been released in BentoML version 1.4.3 that prevents the system from processing HTTP requests that carry the problematic content type. So, you should immediately update to the latest version to protect your AI services from hackers.

If upgrading is not possible, researchers suggest using a Web Application Firewall (WAF) to block incoming web traffic containing the problematic content type and serialized data. However, this might not eliminate the risk.
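As a rough illustration of that kind of rule, the sketch below decides whether a request should be rejected based on its Content-Type header. The media type string is an assumption on our part and should be confirmed against the Checkmarx advisory before use. The function name should_block is illustrative, not part of any WAF product or of BentoML.

```python
# Assumed media type for pickled payloads; verify against the advisory.
BLOCKED_CONTENT_TYPES = {"application/vnd.bentoml+pickle"}


def should_block(content_type):
    """Return True if a request with this Content-Type header should be rejected."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=utf-8" and normalize case,
    # so trivial variations of the header do not slip past the rule.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in BLOCKED_CONTENT_TYPES


print(should_block("application/vnd.bentoml+pickle"))
print(should_block("application/json"))
```

A header check like this only covers the content types it lists, which is one reason the workaround may not eliminate the risk.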



