Chaining NVIDIA’s Triton Server flaws exposes AI systems to remote takeover

Pierluigi Paganini
August 05, 2025

New flaws in NVIDIA’s Triton Server let remote attackers take over systems via RCE, posing major risks to AI infrastructure.

Newly revealed security flaws in NVIDIA’s Triton Inference Server for Windows and Linux could let remote, unauthenticated attackers fully take over vulnerable servers. According to the Wiz Research team, chaining these vulnerabilities enables remote code execution (RCE), posing a serious threat to AI infrastructure.

Triton Inference Server is open-source inference serving software that streamlines AI inferencing. It enables teams to deploy any AI model from multiple deep learning and machine learning frameworks, including TensorRT, TensorFlow, PyTorch, ONNX, OpenVINO, Python, RAPIDS FIL, and more.

“The Wiz Research team has discovered a chain of critical vulnerabilities in NVIDIA’s Triton Inference Server, a popular open-source platform for running AI models at scale,” reads the report published by Wiz. “When chained together, these flaws can potentially allow a remote, unauthenticated attacker to gain complete control of the server, achieving remote code execution (RCE).”

The attack begins in Triton’s Python backend with a minor information leak and escalates to full system compromise, threatening AI models, data, and network security. The researchers disclosed the issues to NVIDIA, which quickly addressed them. The flaws, tracked as CVE-2025-23319, CVE-2025-23320, and CVE-2025-23334, make it urgent for all Triton Inference Server users to update.

The exploitation of the three flaws can lead to code execution, denial of service, data tampering, and information disclosure. An attacker can chain them to fully compromise a server.

The vulnerabilities are described below:

  • CVE-2025-23319 (CVSS score: 8.1) – A vulnerability in the Python backend, where an attacker could cause an out-of-bounds write by sending a request. A successful exploit of this vulnerability might lead to remote code execution, denial of service, data tampering, or information disclosure.
  • CVE-2025-23320 (CVSS score: 7.5) – A vulnerability in the Python backend, where an attacker could cause the shared memory limit to be exceeded by sending a very large request. A successful exploit of this vulnerability might lead to information disclosure.
  • CVE-2025-23334 (CVSS score: 5.9) – A vulnerability in the Python backend, where an attacker could cause an out-of-bounds read by sending a request. A successful exploit of this vulnerability might lead to information disclosure.
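
For triage, the three scores listed above can be mapped to qualitative severity bands. The following is a minimal sketch that assumes the scores are CVSS v3.x base scores (the advisory text quoted here does not state the CVSS version); the function name `cvss_v3_severity` and the `SCORES` dictionary are illustrative, not part of any NVIDIA or Wiz tooling.

```python
# Sketch: map CVSS scores to qualitative severity ratings.
# Assumption: the scores are CVSS v3.x base scores, whose rating scale is
# None (0.0), Low (0.1-3.9), Medium (4.0-6.9), High (7.0-8.9), Critical (9.0-10.0).

def cvss_v3_severity(score: float) -> str:
    """Return the CVSS v3.x qualitative rating for a base score."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


# Scores as listed in the article.
SCORES = {
    "CVE-2025-23319": 8.1,
    "CVE-2025-23320": 7.5,
    "CVE-2025-23334": 5.9,
}

if __name__ == "__main__":
    # Print the flaws from highest to lowest score.
    for cve, score in sorted(SCORES.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{cve}: {score} ({cvss_v3_severity(score)})")
```

Individual ratings understate the risk here: the report's point is that the flaws become far more dangerous when chained, so the lowest-scored issue should not be deprioritized on its own.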

The vulnerabilities have been addressed in version 25.07.
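
Administrators who track deployed releases in an inventory can check them against the fixed version. The sketch below assumes the release is recorded as a `YY.MM` string (optionally with a leading `r` or a trailing suffix such as `-py3`); the function name `is_patched` and the accepted string formats are assumptions for illustration, and the script does not query a running server.

```python
# Sketch: compare a recorded Triton release string against the fixed release.
# Assumption: releases are written as "YY.MM", optionally prefixed with "r"
# or followed by a suffix (e.g. "25.07-py3"). This is a hypothetical helper,
# not an official NVIDIA tool.
import re

FIXED_RELEASE = (25, 7)  # version 25.07, per the article

_RELEASE_RE = re.compile(r"r?(\d+)\.(\d+)")


def is_patched(release: str) -> bool:
    """Return True if the release is 25.07 or later."""
    match = _RELEASE_RE.match(release.strip())
    if match is None:
        raise ValueError(f"Unrecognized release string: {release!r}")
    year, month = int(match.group(1)), int(match.group(2))
    # Tuple comparison orders by year first, then month.
    return (year, month) >= FIXED_RELEASE


if __name__ == "__main__":
    for tag in ["24.12", "25.06", "25.07", "25.08-py3"]:
        status = "patched" if is_patched(tag) else "VULNERABLE, update"
        print(f"{tag}: {status}")
```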

The researchers pointed out that taking over an NVIDIA Triton Inference Server can lead to serious consequences such as theft of proprietary AI models, exposure of sensitive data, manipulation of AI outputs, and using the compromised server to infiltrate deeper into the organization’s network.

“A verbose error message in a single component, a feature that can be misused in the main server were all it took to create a path to potential system compromise. As companies deploy AI and ML more widely, securing the underlying infrastructure is paramount. This discovery highlights the importance of defense-in-depth, where security is considered at every layer of an application,” concludes the report.

The company is not aware of attacks in the wild exploiting these vulnerabilities.
