NoiseAttack: A New Backdoor Attack Exploiting Power Spectral Density for Evasion


Researchers have proposed a novel backdoor attack method called NoiseAttack. Unlike existing methods, which typically target a single class, NoiseAttack can target multiple classes with minimal input configuration, using White Gaussian Noise (WGN) at varying Power Spectral Densities (PSDs) as its trigger together with a dedicated training strategy to execute the attack.

Experimental results show that NoiseAttack achieves high attack success rates on popular network architectures and datasets while bypassing state-of-the-art backdoor detection methods.


The research proposes NoiseAttack, a novel backdoor attack for image classification that leverages the power spectral density (PSD) of White Gaussian Noise (WGN) as a trigger embedded during training. 


An overview of the proposed NoiseAttack

The WGN trigger is imperceptible and applied universally, but it is activated only on specific victim-class samples, causing the backdoored model to misclassify them into multiple target labels. NoiseAttack is effective against state-of-the-art defenses and achieves high attack success rates on various datasets and models.
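The following minimal Python sketch (not the authors' code; the function name and the use of `sigma` as the sole PSD control are assumptions) illustrates how a white-Gaussian-noise trigger of a chosen power spectral density could be stamped onto an image:

```python
import numpy as np

def add_wgn_trigger(image: np.ndarray, sigma: float, rng=None) -> np.ndarray:
    """Add zero-mean white Gaussian noise with standard deviation `sigma` to an image in [0, 1].

    White Gaussian noise has a flat power spectral density whose level is set by
    the variance sigma**2, so choosing sigma selects the PSD of the trigger.
    """
    if rng is None:
        rng = np.random.default_rng()
    noise = rng.normal(loc=0.0, scale=sigma, size=image.shape)
    return np.clip(image + noise, 0.0, 1.0)
```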

NoiseAttack leverages White Gaussian Noise as the trigger to mount a sample-specific, multi-targeted backdoor attack: the adversary gains flexible control over the target labels, the model's performance on clean inputs is preserved, and inputs from the victim class are misclassified once the trigger is applied.

The attack involves training a backdoored model on a poisoned dataset built with carefully crafted noise levels, each associated with its own target label; training on this data makes the model responsive to the trigger and leads to the desired misclassification, as sketched below.
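As a rough illustration of that training-data preparation (a hedged sketch under assumed details, not the paper's implementation; `sigma_to_target`, `poison_rate`, and the sample-selection logic are hypothetical), victim-class samples could be copied with different noise levels and relabelled so that each noise level maps to its own target label:

```python
import numpy as np

def build_poisoned_dataset(images, labels, victim_class, sigma_to_target, poison_rate=0.1, seed=0):
    """Append noisy copies of victim-class samples, relabelled per noise level.

    `sigma_to_target` maps a noise standard deviation (i.e. a PSD level) to the
    target label the backdoored model should predict when that noise is present.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    victim_idx = np.flatnonzero(labels == victim_class)
    chosen = rng.choice(victim_idx, size=max(1, int(poison_rate * len(victim_idx))), replace=False)

    poisoned_images, poisoned_labels = [], []
    for sigma, target in sigma_to_target.items():
        for i in chosen:
            noisy = np.clip(images[i] + rng.normal(0.0, sigma, images[i].shape), 0.0, 1.0)
            poisoned_images.append(noisy)
            poisoned_labels.append(target)

    # Keep all clean samples so accuracy on benign inputs is preserved.
    return (np.concatenate([images, np.stack(poisoned_images)]),
            np.concatenate([labels, np.array(poisoned_labels)]))
```

Here, for instance, `sigma_to_target={0.02: 5, 0.08: 7}` would bind two distinct noise intensities (two PSD levels) to two different target labels for the same victim class.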

It offers a versatile approach for creating backdoors with multiple target labels, providing a powerful tool for adversaries seeking to compromise machine learning models.

An overview of the poisoned dataset preparation

The framework effectively evades state-of-the-art defenses and achieves high attack success rates across various datasets and models. By adding white Gaussian noise to input images, NoiseAttack causes the model to misclassify them into the targeted labels without significantly affecting its performance on clean data.

The attack’s resilience against defense mechanisms such as Grad-CAM, Neural Cleanse, and STRIP highlights its potential as a significant threat to the security of deep neural networks. Additionally, NoiseAttack’s ability to perform multi-targeted attacks demonstrates its versatility and adaptability across different scenarios.

Trigger Reconstruction Using Neural Cleanse

The paper presents a novel backdoor attack method that uses the power spectral density of White Gaussian Noise as its trigger; the attack is highly effective and can target multiple classes simultaneously.
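A small self-contained check (purely illustrative; the sigma values are arbitrary and not taken from the paper) shows why the PSD framing works: white Gaussian noise has a flat spectrum whose level tracks its variance, so different noise intensities form distinguishable triggers:

```python
import numpy as np

rng = np.random.default_rng(0)
for sigma in (0.02, 0.08):  # arbitrary illustrative noise levels
    noise = rng.normal(0.0, sigma, size=(64, 64))
    # Periodogram estimate of the PSD: |FFT|^2 / N; for WGN its mean is ~ sigma^2.
    psd = np.abs(np.fft.fft2(noise)) ** 2 / noise.size
    print(f"sigma={sigma}: mean PSD = {psd.mean():.5f}, sigma^2 = {sigma**2:.5f}")
```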

Through theoretical analysis and extensive experiments, the authors demonstrate the feasibility and wide applicability of this attack. NoiseAttack achieves high average attack success rates across various datasets and models without significantly impacting accuracy on non-victim classes.

The attack is also shown to be evasive and robust, bypassing existing detection and defense techniques; it introduces a new paradigm for backdoor attacks and highlights the need for further research into defense mechanisms.



