Backdoor attacks have emerged as a novel and concerning threat to AI security. These attacks involve training a deep neural network (DNN) on a dataset that contains hidden trigger patterns. The poisoned model behaves normally on benign samples but exhibits abnormal behavior on samples containing the trigger pattern. However, most existing backdoor attacks suffer from two significant drawbacks: their trigger patterns are visible and easily detected by backdoor defenses or even by human inspection, and their injection process loses features of both the natural sample and the trigger pattern, which reduces the attack success rate and model accuracy. In this paper, we propose a novel backdoor attack named SATBA that overcomes these limitations using spatial attention and a U-Net-based model. The attack first uses spatial attention to extract meaningful data features and generate trigger patterns associated with the clean images. A U-shaped model then embeds these trigger patterns into the original data without causing noticeable feature loss. We evaluate our attack on three prominent image classification DNNs across three standard datasets. The results demonstrate that SATBA achieves a high attack success rate while remaining robust against backdoor defenses. Furthermore, we conduct extensive image similarity experiments to demonstrate the stealthiness of our attack strategy. Overall, SATBA addresses the shortcomings of previous backdoor attacks, evading detection while maintaining a high attack success rate.
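The abstract reports results in terms of attack success rate and model accuracy without defining either. The sketch below shows one common way these two quantities are computed from model predictions. It is an illustration under stated assumptions, not the paper's evaluation code: the function names `benign_accuracy` and `attack_success_rate` are hypothetical, and excluding samples whose true label already equals the target label is a widespread convention that the abstract does not confirm.

```python
def benign_accuracy(clean_preds, true_labels):
    """Fraction of clean (trigger-free) samples classified correctly."""
    if len(clean_preds) != len(true_labels) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == y for p, y in zip(clean_preds, true_labels))
    return correct / len(true_labels)


def attack_success_rate(triggered_preds, true_labels, target_label):
    """Fraction of triggered samples classified as the attacker's target label.

    Assumption: samples whose true label is already the target label are
    skipped, since predicting the target for them is not evidence of a backdoor.
    """
    if len(triggered_preds) != len(true_labels):
        raise ValueError("inputs must be of equal length")
    # Keep only predictions for samples that do not already belong to the target class.
    eligible = [p for p, y in zip(triggered_preds, true_labels) if y != target_label]
    if not eligible:
        raise ValueError("no samples outside the target class")
    hits = sum(p == target_label for p in eligible)
    return hits / len(eligible)


if __name__ == "__main__":
    # Toy predictions, invented purely for illustration.
    labels = [0, 1, 2, 1, 0, 2]
    clean = [0, 1, 2, 1, 0, 1]
    triggered = [0, 0, 0, 0, 0, 2]
    print(benign_accuracy(clean, labels))
    print(attack_success_rate(triggered, labels, target_label=0))
```

Under these definitions, an effective and stealthy attack keeps benign accuracy close to that of an unpoisoned model while pushing the attack success rate toward 1.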