A newly disclosed set of security flaws in NVIDIA’s Triton Inference Server for Windows and Linux, an open-source platform for running artificial intelligence (AI) models at scale, could be exploited to take over susceptible servers.
“When chained together, these flaws can potentially allow a remote, unauthenticated attacker to gain complete control of the server, achieving remote code execution (RCE),” Wiz researchers Ronen Shustin and Nir Ohfeld said in a report published today.
The vulnerabilities are listed below –
- CVE-2025-23319 (CVSS score: 8.1) – A vulnerability in the Python backend, where an attacker could cause an out-of-bounds write by sending a request
- CVE-2025-23320 (CVSS score: 7.5) – A vulnerability in the Python backend, where an attacker could cause the shared memory limit to be exceeded by sending a very large request
- CVE-2025-23334 (CVSS score: 5.9) – A vulnerability in the Python backend, where an attacker could cause an out-of-bounds read by sending a request
Successful exploitation of these vulnerabilities could result in information disclosure and, in the case of CVE-2025-23319, remote code execution, denial of service, and data tampering. The issues have been addressed in version 25.07.
The cloud security company said the three shortcomings could be chained together, transforming the problem from an information leak into a full system compromise without requiring any credentials.

Specifically, the problems are rooted in the Python backend, which is designed to handle inference requests for Python models from major AI frameworks such as PyTorch and TensorFlow.
In the attack outlined by Wiz, a threat actor could exploit CVE-2025-23320 to leak the full, unique name of the backend’s internal IPC shared memory region, a key that should have remained private, and then leverage the remaining two flaws to gain full control of the inference server.
“This poses a critical risk to organizations using Triton for AI/ML, as a successful attack could lead to the theft of valuable AI models, exposure of sensitive data, manipulating the AI model’s responses, and a foothold for attackers to move deeper into a network,” the researchers said.
NVIDIA’s August bulletin for Triton Inference Server also highlights fixes for three critical bugs (CVE-2025-23310, CVE-2025-23311, and CVE-2025-23317) that, if successfully exploited, could result in remote code execution, denial of service, information disclosure, and data tampering.
While there is no evidence that any of these vulnerabilities have been exploited in the wild, users are advised to apply the latest updates for optimal protection.
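For teams auditing their deployments, the check against the fixed release can be sketched in a few lines of Python. This is a minimal illustration, not an official NVIDIA tool: it assumes the deployed release is known as a `YY.MM`-style string (for example a container tag such as `25.07-py3`), and the helper names `parse_release` and `is_patched` are hypothetical.

```python
import re

# First release that contains the fixes, per the advisory cited above.
PATCHED_RELEASE = (25, 7)


def parse_release(tag: str) -> tuple[int, int]:
    """Extract (year, month) from a release string such as '25.07-py3'.

    Assumption: the string starts with a two-digit year, a dot, and a
    two-digit month. Anything else is rejected rather than guessed at.
    """
    match = re.match(r"^(\d{2})\.(\d{2})(?:\D|$)", tag)
    if match is None:
        raise ValueError(f"unrecognized release tag: {tag!r}")
    return int(match.group(1)), int(match.group(2))


def is_patched(tag: str) -> bool:
    """Return True if the release is 25.07 or newer."""
    return parse_release(tag) >= PATCHED_RELEASE


if __name__ == "__main__":
    for tag in ("25.06-py3", "25.07-py3"):
        status = "patched" if is_patched(tag) else "needs update"
        print(f"{tag}: {status}")
```

Floating tags such as `latest` carry no version information, so the sketch raises an error for them instead of assuming they are safe.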




