The Victim and The Beneficiary: Exploiting a Poisoned Model to Train a Clean Model on Poisoned Data

Backdoor attacks have recently emerged as a serious security threat to the training process of deep neural networks (DNNs). An attacked model behaves normally on benign samples but outputs an attacker-specified result whenever the trigger is present. Despite the rapid progress of backdoor attacks, existing defenses either struggle to counter these threats effectively or require benign samples, which may be unavailable in real scenarios. In this paper, we find that poisoned samples and benign samples can be distinguished by their prediction entropy. This inspires us to propose a novel dual-network training framework, The Victim and The Beneficiary (V&B), which exploits a poisoned model to train a clean model without extra benign samples. First, we sacrifice the Victim network by training it on suspicious samples so that it becomes a powerful poisoned-sample detector. Second, we train the Beneficiary network on the credible samples selected by the Victim to inhibit backdoor injection. Third, we adopt a semi-supervised suppression strategy to erase potential backdoors and improve model performance. Furthermore, to better suppress poisoned samples that the detector misses, we propose a strong data augmentation method, AttentionMix, which works well with the V&B framework. Extensive experiments on two widely used datasets against 6 state-of-the-art attacks demonstrate that our framework is effective in preventing backdoor injection and robust to various attacks while maintaining performance on benign samples. Our code is available at https://github.com/Zixuan-Zhu/VaB.
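To make the entropy observation concrete, the following is a minimal sketch of splitting samples by prediction entropy. It is not the authors' implementation (see the repository above for that). The function names, the fixed `threshold`, and the assumption that low-entropy samples are the suspicious ones are all illustrative choices; the abstract states only that the two groups can be distinguished by prediction entropy.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def prediction_entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def split_by_entropy(logits_batch, threshold):
    """Partition sample indices into (suspicious, credible).

    ASSUMPTION: samples whose entropy falls below `threshold` are treated
    as suspicious. Both the direction and the threshold are hypothetical
    and are not taken from the paper.
    """
    suspicious, credible = [], []
    for idx, logits in enumerate(logits_batch):
        h = prediction_entropy(softmax(logits))
        if h < threshold:
            suspicious.append(idx)
        else:
            credible.append(idx)
    return suspicious, credible

if __name__ == "__main__":
    # Toy logits: one very confident prediction, one near-uniform prediction.
    batch = [[10.0, 0.0, 0.0], [0.1, 0.0, 0.05]]
    suspicious, credible = split_by_entropy(batch, threshold=0.5)
    print("suspicious:", suspicious)
    print("credible:", credible)
```

In a dual-network setup like the one described, a split of this kind would decide which samples one network trains on and which are passed to the other. A fixed threshold is the simplest possible rule and is used here only for illustration.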

