With the growing popularity of Artificial Intelligence (AI) technology, adversaries have designed numerous backdoor attacks that mislead deep neural network predictions by manipulating training samples and training processes. Although backdoor attacks are effective in various real-world scenarios, they still suffer from two problems: low fidelity of poisoned samples and non-negligible transfer in latent space, both of which make them easily detectable by existing backdoor detection algorithms. To overcome these weaknesses, this paper proposes a novel frequency-based backdoor attack method named WaveAttack, which obtains high-frequency image features through the Discrete Wavelet Transform (DWT) to generate backdoor triggers. Furthermore, we introduce an asymmetric frequency obfuscation method, which adds an adaptive residual in the training and inference stages to improve the impact of triggers and further enhance the effectiveness of WaveAttack. Comprehensive experimental results show that WaveAttack not only achieves higher stealthiness and effectiveness, but also outperforms state-of-the-art (SOTA) backdoor attack methods in image fidelity, with up to 28.27\% improvement in PSNR, 1.61\% improvement in SSIM, and 70.59\% reduction in IS.
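To make the role of the DWT concrete, the following is a minimal sketch of a single-level 2D Haar wavelet decomposition, which splits an image into one low-frequency subband (LL) and three high-frequency subbands (LH, HL, HH). It is not the WaveAttack implementation: the choice of the Haar wavelet, the function names `haar_dwt2`, `haar_idwt2`, and `psnr`, the toy 4x4 image, and the constant residual added to HH are all assumptions made for illustration. The sketch only shows that a small change confined to a high-frequency subband yields a reconstructed image that stays close to the original, as measured by PSNR.

```python
import math

def haar_dwt2(img):
    """Single-level 2D Haar DWT of an even-sized grayscale image (list of rows).

    Returns (LL, LH, HL, HH). LL is the low-frequency approximation; the other
    three hold high-frequency detail. Naming of LH vs. HL varies by convention.
    """
    h, w = len(img), len(img[0])
    assert h % 2 == 0 and w % 2 == 0, "image sides must be even"
    ll, lh, hl, hh = ([[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]          # top row of the 2x2 block
            c, d = img[i + 1][j], img[i + 1][j + 1]  # bottom row of the 2x2 block
            r, s = i // 2, j // 2
            ll[r][s] = (a + b + c + d) / 2  # block average (scaled)
            lh[r][s] = (a + b - c - d) / 2  # top minus bottom
            hl[r][s] = (a - b + c - d) / 2  # left minus right
            hh[r][s] = (a - b - c + d) / 2  # diagonal difference
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuild the image from its four subbands."""
    h2, w2 = len(ll), len(ll[0])
    img = [[0.0] * (2 * w2) for _ in range(2 * h2)]
    for r in range(h2):
        for s in range(w2):
            p, q, u, v = ll[r][s], lh[r][s], hl[r][s], hh[r][s]
            img[2 * r][2 * s] = (p + q + u + v) / 2
            img[2 * r][2 * s + 1] = (p + q - u - v) / 2
            img[2 * r + 1][2 * s] = (p - q + u - v) / 2
            img[2 * r + 1][2 * s + 1] = (p - q - u + v) / 2
    return img

def psnr(x, y, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    n = len(x) * len(x[0])
    mse = sum((p - q) ** 2 for rx, ry in zip(x, y) for p, q in zip(rx, ry)) / n
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# Toy 4x4 grayscale image (illustrative values only).
image = [[16 * i + 8 * j for j in range(4)] for i in range(4)]

ll, lh, hl, hh = haar_dwt2(image)

# Hypothetical residual: shift every HH coefficient by a small constant,
# leaving the low-frequency content (LL) untouched.
hh_modified = [[v + 1.0 for v in row] for row in hh]
modified = haar_idwt2(ll, lh, hl, hh_modified)

print("PSNR after HH perturbation (dB):", psnr(image, modified))
```

Because the Haar transform used here is orthonormal, the energy of the change in the HH subband equals the energy of the change in the pixels, so a small residual in the frequency domain stays small in the image domain. A real attack pipeline would operate on color images and learn the residual rather than use a constant.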