Deep neural networks (DNNs) are vulnerable to backdoor attacks, in which an adversary manipulates a small portion of the training data so that the victim model behaves normally on benign samples but classifies triggered samples as the target class. Backdoor attacks are an emerging training-phase threat that poses serious risks to DNN-based applications. In this paper, we revisit the trigger patterns of existing backdoor attacks and reveal that they are either visible or not sparse, and therefore not stealthy enough. More importantly, simply combining existing methods is not sufficient to design an effective sparse and invisible backdoor attack. To address this problem, we formulate trigger generation as a bi-level optimization problem with sparsity and invisibility constraints and propose an effective method to solve it. We dub the proposed method the sparse and invisible backdoor attack (SIBA). We conduct extensive experiments on benchmark datasets under different settings, which verify the effectiveness of our attack and its resistance to existing backdoor defenses. The code for reproducing the main experiments is available at \url{https://github.com/YinghuaGao/SIBA}.
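To make the two constraints concrete, the snippet below is a minimal NumPy sketch of a projection step that enforces sparsity (at most $k$ nonzero trigger pixels, an $\ell_0$ budget) and invisibility (entries bounded by $\epsilon$, an $\ell_\infty$ budget). It is not the paper's bi-level algorithm: the quadratic surrogate loss, the function name `project_trigger`, and the budgets $k=10$, $\epsilon=0.05$ are illustrative placeholders only.

```python
import numpy as np

def project_trigger(delta, k, eps):
    """Project a trigger onto the sparse-and-invisible constraint set (sketch)."""
    # Invisibility: clamp every entry to [-eps, eps] (L_inf constraint).
    delta = np.clip(delta, -eps, eps)
    # Sparsity: keep only the k largest-magnitude entries (L_0 constraint).
    flat = np.abs(delta).ravel()
    if flat.size > k:
        idx = np.argsort(-flat)[:k]          # indices of the k largest magnitudes
        mask = np.zeros(flat.size, dtype=bool)
        mask[idx] = True
        delta = delta * mask.reshape(delta.shape)
    return delta

rng = np.random.default_rng(0)
delta = rng.normal(size=(8, 8))              # toy 8x8 trigger pattern
target = rng.normal(size=(8, 8))             # stand-in for the inner objective
for _ in range(50):
    grad = delta - target                    # gradient of 0.5 * ||delta - target||^2
    delta = project_trigger(delta - 0.1 * grad, k=10, eps=0.05)
```

After the loop, `delta` has at most 10 nonzero entries, each no larger than 0.05 in magnitude; in the actual attack, the surrogate loss would be replaced by the bi-level objective over the victim model.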