Data-poisoning-based backdoor attacks aim to insert a backdoor into a model by manipulating the training dataset, without controlling the training process of the target model. Existing attack methods mainly focus on designing triggers or strategies for fusing triggers with benign samples. However, they usually select the samples to be poisoned at random, disregarding the fact that poisoning samples differ in their importance for backdoor injection. A recent selection strategy filters a fixed-size pool of poisoning samples by recording forgetting events, but it fails to consider the remaining samples outside the pool from a global perspective. Moreover, computing forgetting events requires significant additional computing resources. Therefore, how to select poisoning samples from the entire dataset efficiently and effectively is an urgent problem in backdoor attacks. To address it, we first introduce a poisoning mask into the regular backdoor training loss. We suppose that a backdoored model trained with hard poisoning samples has a stronger backdoor effect on easy ones. Training with hard samples can be implemented by hindering the normal training process (i.e., maximizing the loss w.r.t. the mask). To further integrate this with the normal training process, we then propose a learnable poisoning sample selection strategy that learns the mask together with the model parameters through a min-max optimization. Specifically, the outer loop aims to achieve the backdoor attack goal by minimizing the loss based on the selected samples, while the inner loop selects hard poisoning samples that impede this goal by maximizing the loss. After several rounds of adversarial training, we finally select effective poisoning samples with high contribution. Extensive experiments on benchmark datasets demonstrate the effectiveness and efficiency of our approach in boosting backdoor attack performance.
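The alternating min-max procedure described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy logistic-regression model, a hypothetical trigger that overwrites the last feature, and a fixed poisoning budget, and it solves the inner maximization by picking the samples whose poisoning increases the loss the most. All names (`add_trigger`, `select_mask`, `train_step`, `learn_selection`) are illustrative only.

```python
import numpy as np

def per_sample_loss(w, X, y):
    """Logistic (cross-entropy) loss of a linear model, one value per sample."""
    z = X @ w
    # log(1 + exp(z)) - y*z, computed in a numerically stable way
    return np.logaddexp(0.0, z) - y * z

def add_trigger(X, strength=3.0):
    """Hypothetical trigger (an assumption): set the last feature to a constant."""
    Xp = X.copy()
    Xp[:, -1] = strength
    return Xp

def select_mask(w, X, y, target, budget):
    """Inner step: choose the mask that maximizes the masked loss.

    The masked loss is linear in the mask, so under a fixed budget the
    maximizer is the `budget` samples with the largest loss increase.
    """
    poisoned = per_sample_loss(w, add_trigger(X), np.full(len(y), target))
    gain = poisoned - per_sample_loss(w, X, y)
    mask = np.zeros(len(y), dtype=bool)
    mask[np.argsort(gain)[-budget:]] = True
    return mask

def train_step(w, X, y, mask, target, lr=0.1, steps=50):
    """Outer step: gradient descent on the loss of the currently poisoned set."""
    Xm = np.where(mask[:, None], add_trigger(X), X)
    ym = np.where(mask, target, y).astype(float)
    for _ in range(steps):
        p = 0.5 * (1.0 + np.tanh(0.5 * (Xm @ w)))  # stable sigmoid
        w = w - lr * Xm.T @ (p - ym) / len(ym)
    return w

def learn_selection(X, y, target=1, budget=20, rounds=5, seed=0):
    """Alternate model training (min) and mask selection (max)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    mask = np.zeros(len(y), dtype=bool)
    mask[rng.choice(len(y), budget, replace=False)] = True  # random start
    for _ in range(rounds):
        w = train_step(w, X, y, mask, target)
        mask = select_mask(w, X, y, target, budget)
    return mask, w
```

Calling `learn_selection(X, y)` on a feature matrix `X` and binary labels `y` returns a boolean mask marking the selected poisoning samples, together with the surrogate model weights. A real attack would replace the linear model with a deep network and may add further constraints on the mask.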