Researchers demo AI-crippling GPUHammer attack

Security researchers have demonstrated the first Rowhammer attack targeting graphics processing unit (GPU) memory, revealing a critical vulnerability that could allow attackers to sabotage artificial intelligence models running on widely used Nvidia hardware.

The attack, dubbed GPUHammer, was developed by researchers Chris Lin, Joyce Qu and Gururaj Saileshwar of the University of Toronto and could pose serious risks for AI use.

By flipping a single bit in GPU memory, the researchers were able to cause AI model accuracy to plummet from 80 percent to less than 0.5 percent.

The researchers based their attack on the existing Rowhammer hardware vulnerability, which exploits the physical properties of modern memory chips and is difficult to mitigate.

They noted that while Rowhammer vulnerabilities have been extensively studied on CPU systems, their impact on GPU memory systems critical for machine learning applications had remained unexplored.

Technically, the researchers exploited the vulnerability by rapidly activating specific memory rows.

Doing so causes electrical interference, which in turn flips bits in adjacent rows, potentially corrupting stored data.
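The mechanism described above can be illustrated with a toy model. This is a rough sketch under loud assumptions, not a model of real DRAM or of the researchers' code: the threshold value, the row numbers and the function name are all hypothetical, and the only idea it captures is that a row flips once its neighbours have been activated often enough within one refresh window.

```python
from collections import Counter

# Hypothetical number of neighbour activations (within one refresh window)
# needed to disturb a row. Real thresholds vary by chip and are not given
# in the article.
HAMMER_THRESHOLD = 10_000


def simulate_refresh_window(activations, threshold=HAMMER_THRESHOLD):
    """Toy Rowhammer model: return the rows that would suffer a bit-flip.

    `activations` is the sequence of row numbers opened during one refresh
    window. Each activation of row r disturbs rows r-1 and r+1. Rows that
    are themselves activated are treated as restored and never flip.
    """
    counts = Counter(activations)
    disturbance = Counter()
    for row, n in counts.items():
        disturbance[row - 1] += n
        disturbance[row + 1] += n
    aggressors = set(counts)
    return {
        row
        for row, hits in disturbance.items()
        if row >= 0 and row not in aggressors and hits >= threshold
    }


# Double-sided pattern: rows 9 and 11 sandwich victim row 10.
double_sided = [9, 11] * 6_000
# Same total number of activations, but the two rows share no neighbour.
far_apart = [9, 100] * 6_000

print(simulate_refresh_window(double_sided))  # {10}
print(simulate_refresh_window(far_apart))     # set()
```

In this toy model, the same number of activations only causes a flip when both aggressor rows border the same victim, which is why knowing how addresses map to physical rows matters to an attacker.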

To make the attack work, the University of Toronto team overcame tricky technical challenges to adapt Rowhammer techniques for GPU systems.

For example, graphics processors use different memory architectures than CPUs, with higher latency and faster refresh rates that typically prevent successful attacks.

To get around that problem, the researchers developed novel techniques to reverse-engineer how Nvidia GPUs map memory addresses and created parallelised attack patterns that achieve activation rates of up to 620,000 per refresh period – close to the theoretical maximum.

Their successful demonstration targeted an Nvidia A6000 GPU with 48 gigabytes of GDDR6 (graphics double data rate) memory.

The attack produced eight bit-flips across four memory banks, proving that GPU Rowhammer attacks are practical rather than merely theoretical.

Dramatic impact on AI systems

The researchers tested their attack against five AI models: AlexNet, VGG16, ResNet50, DenseNet161, and InceptionV3.

Testing on these showed that single bit-flips targeting the most significant bit of neural network weight exponents caused accuracy to collapse dramatically.

In the most severe cases, models that previously achieved 80 percent accuracy on image recognition tasks saw performance drop to just 0.02 percent after a single corrupted bit.

At that level a model is effectively unusable, as its predictions are almost never correct.

[Image: Researchers demo AI-crippling GPUHammer attack. Source: Lin, Qu and Saileshwar]

GPUHammer attacks are particularly effective against models using 16-bit floating-point weights, a common optimisation in modern AI systems.

Flipping single bits in the exponent portion of these weights can exponentially alter their values, cascading through the entire neural network.
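The effect of an exponent bit-flip can be reproduced in a few lines. The sketch below is illustrative only and is not the researchers' code: the helper name `flip_bit_fp16` and the sample weights are assumptions. It uses the standard IEEE 754 half-precision layout (1 sign bit, 5 exponent bits, 10 mantissa bits), in which bit 14 is the most significant exponent bit.

```python
import struct


def flip_bit_fp16(value: float, bit: int) -> float:
    """Flip one bit (0 = lowest mantissa bit, 15 = sign) of a 16-bit float.

    The value is first rounded to half precision, then reinterpreted as a
    16-bit integer so that a single bit can be toggled.
    """
    (raw,) = struct.unpack("<H", struct.pack("<e", value))
    (flipped,) = struct.unpack("<e", struct.pack("<H", raw ^ (1 << bit)))
    return flipped


# Flipping the top exponent bit scales a small weight by 2**16.
print(flip_bit_fp16(0.5, 14))   # 32768.0
print(flip_bit_fp16(0.25, 14))  # 16384.0

# Flipping the lowest mantissa bit barely changes the value.
print(flip_bit_fp16(0.5, 0))    # 0.50048828125
```

A weight that jumps from 0.5 to 32768 dominates every sum it takes part in, which is consistent with the article's description of one flip cascading through the network.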

Furthermore, GPUHammer poses risks for cloud-based AI services, where multiple customers’ workloads often share the same GPU hardware and their data can reside in the same memory bank.

The researchers also demonstrated memory manipulation techniques that could allow attackers to precisely target victim data by exploiting how GPU memory allocators reuse freed memory blocks.
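The idea of steering victim data through allocator reuse can be sketched with a toy allocator. Everything here is hypothetical: the `ToyAllocator` class, the last-freed-first-reused policy and the block numbers are assumptions chosen for illustration, and nothing below describes how Nvidia's allocator actually behaves.

```python
class ToyAllocator:
    """Hypothetical allocator that hands out the most recently freed block."""

    def __init__(self, num_blocks: int):
        # Stored in reverse so that pop() returns block 0 first, then 1, ...
        self.free_list = list(range(num_blocks - 1, -1, -1))

    def alloc(self) -> int:
        return self.free_list.pop()

    def free(self, block: int) -> None:
        self.free_list.append(block)


def steer_victim(vulnerable_block: int, num_blocks: int = 8) -> int:
    """Return the block a victim receives after the attacker's setup.

    The attacker claims every block, then frees only the one it has
    (hypothetically) profiled as containing a flippable bit.
    """
    allocator = ToyAllocator(num_blocks)
    attacker_blocks = [allocator.alloc() for _ in range(num_blocks)]
    assert vulnerable_block in attacker_blocks
    allocator.free(vulnerable_block)
    return allocator.alloc()  # the victim's next allocation


print(steer_victim(vulnerable_block=5))  # 5
```

Under these assumptions the victim's data lands exactly where the attacker wants it, because the freed block is the only one available for reuse.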

Nvidia confirms GPUHammer and looks for mitigations

GPUHammer was disclosed to graphics and AI processor vendor Nvidia in January this year. The company has confirmed the vulnerability and is investigating potential fixes.

Major cloud providers such as AWS, Microsoft Azure and Google Cloud Platform have been notified about GPUHammer as well.

Several mitigation strategies exist, though each carries trade-offs.

Enabling error-correcting code (ECC) memory can prevent single bit-flip attacks but introduces performance penalties of between three and 10 percent, and memory overhead of 6.5 percent.
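How an error-correcting code undoes a single bit-flip can be shown with the textbook Hamming(7,4) code. This is a stand-in chosen for brevity and an assumption on our part: it is not the ECC scheme used in Nvidia GPUs, and practical ECC memory uses much wider codewords, which keeps the relative storage overhead far lower than the three check bits per four data bits used here.

```python
def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit Hamming codeword.

    Layout (1-indexed positions): p1 p2 d1 p4 d2 d3 d4.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(code):
    """Correct up to one flipped bit; return (data bits, error position).

    The error position is 1-indexed, or 0 if no error was detected.
    """
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4  # syndrome points at the bad bit
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], position


word = [1, 0, 1, 1]
stored = hamming74_encode(word)
stored[4] ^= 1  # simulate a Rowhammer-style flip at position 5
recovered, position = hamming74_decode(stored)
print(recovered, position)  # [1, 0, 1, 1] 5
```

A code of this kind corrects only one flipped bit per codeword, which matches the article's point that ECC can prevent single bit-flip attacks.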

The researchers found that many organisations disable ECC by default due to these overheads, which can reduce bandwidth by up to 12 percent.

GPU manufacturers could also implement modern Rowhammer defences such as Refresh Management (RFM) or Per Row Activation Counting (PRAC) in future memory generations.

Randomising virtual-to-physical memory mappings would force attackers to repeatedly profile memory layouts, which would increase attack complexity considerably.

The researchers plan to make their code publicly available following the expiry of Nvidia’s embargo on August 12.
