Deep neural networks (DNNs) are vulnerable to backdoor attacks, in which an adversary manipulates a small portion of the training data so that the victim model predicts normally on benign samples but classifies triggered samples as the target class. Backdoor attacks are an emerging training-phase threat that poses serious risks to DNN-based applications. In this paper, we revisit the trigger patterns of existing backdoor attacks. We reveal that they are either visible or not sparse, and are therefore not stealthy enough. More importantly, simply combining existing methods does not yield an effective sparse and invisible backdoor attack. To address this problem, we formulate trigger generation as a bi-level optimization problem with sparsity and invisibility constraints and propose an effective method to solve it. The proposed method is dubbed sparse and invisible backdoor attack (SIBA). We conduct extensive experiments on benchmark datasets under different settings, which verify the effectiveness of our attack and its resistance to existing backdoor defenses. The code for reproducing the main experiments is available at \url{https://github.com/YinghuaGao/SIBA}.
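To make the sparsity and invisibility constraints concrete, the following is a minimal sketch of one simple way to enforce them on a flattened trigger: keep only the k largest-magnitude entries (an L0-style budget) and clip the survivors to [-eps, eps] (an L-infinity-style budget). The function name `project_trigger` and the parameters `k` and `eps` are hypothetical and chosen for illustration; this is not taken from the SIBA code base and does not cover the bi-level optimization itself.

```python
def project_trigger(delta, k, eps):
    """Map a flat trigger onto {d : ||d||_0 <= k, ||d||_inf <= eps}.

    Hypothetical helper for illustration; not taken from the SIBA code base.
    """
    if k < 0 or eps < 0:
        raise ValueError("k and eps must be non-negative")
    # Indices of the k entries with the largest magnitude (sparsity budget).
    order = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(order[:k])
    # Zero all other entries, then clip survivors to [-eps, eps] (invisibility budget).
    return [max(-eps, min(eps, v)) if i in keep else 0.0
            for i, v in enumerate(delta)]


trigger = [0.30, -0.02, 0.01, -0.50, 0.04]
projected = project_trigger(trigger, k=2, eps=0.1)
print(projected)  # [0.1, 0.0, 0.0, -0.1, 0.0]
```

In an iterative trigger-generation loop, a step like this would typically be applied after each update of the trigger, so that every intermediate trigger stays within both budgets.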