Recently, backdoor attacks have posed a serious security threat to the training process of deep neural networks (DNNs). An attacked model behaves normally on benign samples but outputs a specific result when the trigger is present. However, compared with the rapid progress of backdoor attacks, existing defenses either struggle to counter these threats effectively or require benign samples, which may be unavailable in real scenarios. In this paper, we find that poisoned samples and benign samples can be distinguished by their prediction entropy. This inspires us to propose a novel dual-network training framework, The Victim and The Beneficiary (V&B), which exploits a poisoned model to train a clean model without extra benign samples. First, we sacrifice the Victim network, training it on suspicious samples so that it becomes a powerful poisoned-sample detector. Second, we train the Beneficiary network on the credible samples selected by the Victim to inhibit backdoor injection. Third, we adopt a semi-supervised suppression strategy to erase potential backdoors and improve model performance. Furthermore, to better inhibit missed poisoned samples, we propose a strong data augmentation method, AttentionMix, which works well with the V&B framework. Extensive experiments on two widely used datasets against six state-of-the-art attacks demonstrate that our framework is effective in preventing backdoor injection and robust to various attacks while maintaining performance on benign samples. Our code is available at https://github.com/Zixuan-Zhu/VaB.
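To make the entropy observation concrete, the following is a minimal sketch of how prediction entropy could be computed from a model's logits and used to split a batch into suspicious and credible samples. It is not the released V&B implementation. The function names, the threshold value, and the rule that low entropy marks a sample as suspicious are all assumptions made for illustration; the abstract states only that the two groups can be distinguished by prediction entropy.

```python
import math


def prediction_entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution over one sample's logits."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Skip zero probabilities: they contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0)


def split_by_entropy(batch_logits, threshold):
    """Return (suspicious_idx, credible_idx).

    ASSUMPTION: samples whose entropy falls below `threshold` are treated as
    suspicious. Both the direction and the threshold are hypothetical.
    """
    suspicious, credible = [], []
    for i, logits in enumerate(batch_logits):
        if prediction_entropy(logits) < threshold:
            suspicious.append(i)
        else:
            credible.append(i)
    return suspicious, credible


# Toy logits for three samples of a 3-class problem (made-up values).
batch_logits = [
    [10.0, 0.0, 0.0],  # very confident prediction, entropy close to 0
    [0.0, 0.0, 0.0],   # uniform prediction, entropy equals ln(3)
    [2.0, 1.0, 0.0],   # moderately confident prediction
]
entropies = [prediction_entropy(row) for row in batch_logits]
suspicious_idx, credible_idx = split_by_entropy(batch_logits, threshold=0.5)
```

In a real pipeline the logits would come from the Victim network, and the threshold would need to be chosen from the observed entropy distribution rather than fixed by hand.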