As artificial intelligence infrastructure rapidly expands, critical security flaws threaten the backbone of enterprise AI deployments.
Security researchers at Oligo Security have uncovered a series of dangerous Remote Code Execution (RCE) vulnerabilities affecting major AI inference frameworks from Meta, NVIDIA, Microsoft, and Modular, as well as the PyTorch-ecosystem projects vLLM and SGLang.
The vulnerabilities, collectively termed “ShadowMQ,” stem from the unsafe implementation of ZeroMQ (ZMQ) communications combined with Python’s pickle deserialization.
What makes this threat particularly alarming is how it spread across the AI ecosystem through code reuse and copy-paste development practices.
## How the Vulnerability Spread Across Frameworks
The investigation began in 2024 when researchers analyzed Meta’s Llama Stack and discovered the dangerous use of ZMQ’s `recv_pyobj()` method, which deserializes incoming data using Python’s pickle module.
**ShadowMQ Vulnerability CVE Data Table**
| CVE ID | Product | Severity | CVSS Score | Vulnerability Type |
|---|---|---|---|---|
| CVE-2024-50050 | Meta Llama Stack | Critical | 9.8 | Remote Code Execution |
| CVE-2025-30165 | vLLM | Critical | 9.8 | Remote Code Execution |
| CVE-2025-23254 | NVIDIA TensorRT-LLM | Critical | 9.3 | Remote Code Execution |
| CVE-2025-60455 | Modular Max Server | Critical | 9.8 | Remote Code Execution |
| N/A (Unpatched) | Microsoft Sarathi-Serve | Critical | 9.8 | Remote Code Execution |
| N/A (Incomplete Fix) | SGLang | Critical | 9.8 | Remote Code Execution |
This configuration created unauthenticated network sockets that could execute arbitrary code during deserialization, enabling remote attackers to compromise systems.
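To see why this pattern is exploitable, the sketch below demonstrates the underlying mechanism: `recv_pyobj()` is effectively `pickle.loads()` applied to raw network bytes, and a pickle stream can name any importable callable to invoke during deserialization. The class name and command here are illustrative stand-ins, not taken from any real exploit.

```python
import os
import pickle

# A pickle stream can instruct the deserializer to call an arbitrary
# importable callable. An attacker crafts this via __reduce__:
class MaliciousPayload:
    def __reduce__(self):
        # On unpickling, pickle invokes the returned callable with the
        # given arguments. "echo pwned" is a harmless stand-in for an
        # attacker's shell command.
        return (os.system, ("echo pwned",))

wire_bytes = pickle.dumps(MaliciousPayload())  # what an attacker sends to the socket
result = pickle.loads(wire_bytes)              # what recv_pyobj() does on receipt
# The command executes during deserialization itself -- before the
# application code ever inspects the message. result holds its exit status.
```

Because the code runs inside the deserializer, no flaw in the application logic is needed: merely receiving a message on an exposed socket is enough to be compromised.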
After Meta patched the vulnerability (CVE-2024-50050), Oligo researchers found identical security flaws across multiple frameworks.
NVIDIA’s TensorRT-LLM, PyTorch projects vLLM and SGLang, and Modular’s Max Server all contained nearly identical vulnerable patterns.
Oligo’s code analysis revealed that entire files were copied between projects, spreading the security flaw like a virus. These AI inference servers power critical enterprise infrastructure, processing sensitive data across GPU clusters.
Organizations relying on SGLang include xAI, AMD, NVIDIA, Intel, LinkedIn, Oracle Cloud, Google Cloud, Microsoft Azure, AWS, MIT, Stanford, UC Berkeley, and numerous other technology companies and research institutions.
Successful exploitation could allow attackers to execute arbitrary code, escalate privileges, exfiltrate model data, or install cryptocurrency miners.
Oligo researchers identified thousands of exposed ZMQ sockets communicating unencrypted over the public internet. Meanwhile, Microsoft’s Sarathi-Serve remains unpatched, and SGLang’s initial fix is incomplete.
Organizations should immediately update to patched versions, avoid using pickle with untrusted data, implement authentication for ZMQ communications, and restrict network access to ZMQ endpoints.
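As a minimal sketch of the "avoid pickle with untrusted data" recommendation, the helpers below (names are hypothetical, not from any framework's API) replace pickle-based message handling with JSON, whose deserializer can only produce plain data types and can never execute code. Frameworks that must keep ZMQ can pair this with ZMQ's built-in CURVE authentication and encryption.

```python
import json

# Hypothetical replacements for send_pyobj()/recv_pyobj(): serialize
# messages as JSON so that deserialization cannot run arbitrary code.
def encode_message(obj) -> bytes:
    return json.dumps(obj).encode("utf-8")

def decode_message(data: bytes):
    # json.loads only builds dicts, lists, strings, numbers, booleans,
    # and None -- unlike pickle, it cannot instantiate arbitrary
    # callables, so a malicious payload yields data, not code execution.
    return json.loads(data.decode("utf-8"))

msg = {"op": "infer", "tokens": [1, 2, 3]}
roundtrip = decode_message(encode_message(msg))
```

The trade-off is that JSON cannot carry arbitrary Python objects; messages must be restricted to plain data, which is exactly what makes the channel safe to expose.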
