Incomplete Patch Leaves NVIDIA and Docker Users at Risk
Trend Micro found major flaws in the NVIDIA Container Toolkit and Docker that expose AI infrastructure to container escapes and DoS attacks. Users should audit their setups and apply fixes.
Trend Micro Research has recently exposed a critical security vulnerability affecting the NVIDIA Container Toolkit and Docker and threatening systems utilizing these technologies.
The research, shared with Hackread.com, traces the issue to a security update NVIDIA released in September 2024. The update was intended to fix a vulnerability identified as CVE-2024-0132 in the NVIDIA Container Toolkit, but it was incomplete. This oversight leaves systems susceptible to container escape attacks.
Trend Micro’s findings reveal that the incomplete patch for CVE-2024-0132 leaves a time-of-check time-of-use (TOCTOU) vulnerability within the NVIDIA Container Toolkit. This vulnerability allows a maliciously crafted container to gain access to the host file system. Earlier versions of the toolkit are affected, and version 1.17.4 remains vulnerable if the “allow-cuda-compat-libs-from-container” feature is explicitly enabled.
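To make the TOCTOU class of bug concrete, here is a minimal, generic Python sketch. It is not the NVIDIA Container Toolkit's code, and the function names are hypothetical. It only shows why checking a path and then opening it in a separate step leaves a window an attacker can exploit by swapping in a symlink, and how folding the check into the open call narrows that window on Linux.

```python
import os
import stat


def unsafe_read(path: str) -> bytes:
    """Check-then-use: vulnerable to a swap between the two steps."""
    # Time of check: the path is not a symlink right now.
    if os.path.islink(path):
        raise PermissionError("symlinks are not allowed")
    # Race window: another process could replace `path` with a symlink here.
    # Time of use: open() follows whatever `path` points to at this moment.
    with open(path, "rb") as f:
        return f.read()


def safer_read(path: str) -> bytes:
    """Check and use in one step, then validate the opened descriptor."""
    # O_NOFOLLOW makes the open itself fail if the final path component
    # is a symlink, so there is no gap between checking and opening.
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    try:
        # fstat inspects the object we actually opened, not the path name.
        if not stat.S_ISREG(os.fstat(fd).st_mode):
            raise PermissionError("not a regular file")
        with os.fdopen(fd, "rb", closefd=False) as f:
            return f.read()
    finally:
        os.close(fd)
```

Note that O_NOFOLLOW only guards the last path component; symlinks in parent directories need additional handling, which this sketch does not attempt.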
In addition, researchers revealed a denial-of-service (DoS) vulnerability impacting Docker on Linux systems. This issue, which has also been independently reported by Moby and NVIDIA, stems from the way Docker handles multiple mounts configured with bind-propagation=shared.
When a Docker container stops, its file system connections should be removed, but a bug prevents this, causing the “mount table” (which tracks these connections) to grow rapidly. This excessive growth consumes all available file descriptors, which are needed to manage connections. As a result, Docker cannot start new containers, and the system can suffer performance issues severe enough to disconnect users.
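One rough way to watch for this kind of mount table growth is to count entries in the kernel's mount listing over time. The sketch below is an illustration, not a tool from the Trend Micro report. It assumes the Linux /proc/self/mountinfo format described in proc(5), where shared propagation appears as a "shared:N" optional field before the "-" separator. The function name is hypothetical.

```python
from collections import Counter


def summarize_mountinfo(text: str) -> dict:
    """Summarize mount entries from text in /proc/self/mountinfo format."""
    total = 0
    shared = 0
    by_mount_point = Counter()
    for line in text.splitlines():
        fields = line.split()
        try:
            # Optional fields sit between field 6 and the "-" separator.
            sep = fields.index("-", 6)
        except ValueError:
            continue  # skip blank or malformed lines
        total += 1
        if any(f.startswith("shared:") for f in fields[6:sep]):
            shared += 1
        by_mount_point[fields[4]] += 1  # field 5 is the mount point
    return {
        "total": total,
        "shared": shared,
        "most_repeated": by_mount_point.most_common(3),
    }
```

On a Linux host, one might call this periodically with the contents of /proc/self/mountinfo and alert when the total keeps climbing after containers have stopped.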
Trend Micro illustrates the risk with an attack scenario in which an attacker creates malicious container images connected via a volume symlink and runs them on a victim’s platform. The attacker can then gain access to the host file system and container runtime Unix sockets, execute arbitrary commands with root privileges, and take full remote control of the host.
The consequences of these vulnerabilities could be severe. As the report states, successful attacks could lead to “unauthorized access to sensitive host data, theft of proprietary AI models,” and “severe operational disruptions.”
Companies using NVIDIA and Docker in areas like AI and cloud computing are most at risk. This is especially true for those using default settings or newer features. Trend Micro recommends several steps to protect against these vulnerabilities. These include limiting access to Docker, disabling unnecessary software features, and carefully checking software images. The report also advises companies to “regularly audit container-to-host interactions.”
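As one small example of auditing for unnecessary features, the sketch below scans configuration text for the “allow-cuda-compat-libs-from-container” flag named in the research. It is a simple line scan under stated assumptions: that the toolkit's configuration is a TOML-style file in which the flag appears as a `key = true` line. The file's location and section layout are not given in the report, so the caller supplies the text. The function name is hypothetical.

```python
import re

# Flag name taken from the article; everything else here is illustrative.
FLAG = "allow-cuda-compat-libs-from-container"

# Matches an uncommented `flag = true` line, with optional quotes around
# the key and an optional trailing comment.
_PATTERN = re.compile(
    r'^\s*"?' + re.escape(FLAG) + r'"?\s*=\s*true\s*(#.*)?$'
)


def flag_enabled(config_text: str) -> bool:
    """Return True if the flag is explicitly set to true in the text."""
    return any(_PATTERN.match(line) for line in config_text.splitlines())
```

A line scan like this ignores TOML section context, so a hit is a prompt to inspect the file by hand rather than proof of exposure.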
Thomas Richards, Infrastructure Security Practice Director at Black Duck, a Burlington, Massachusetts-based provider of application security solutions, commented on the latest development, warning companies to install patches immediately.
“The severity of these vulnerabilities should prompt organizations to take immediate action to patch their systems and better manage software risk. Given how NVIDIA has become the de facto standard for AI processing, this potentially affects every organization involved in the AI space,” Richards warned.
“With working proof of concept code for some of the issues, organizations are already at risk. Data corruption or system downtime can negatively impact the LLM models and create supply chain concerns if the models are corrupted for downstream applications.”