Adversarial attacks evolve rapidly, on a monthly basis, and numerous defenses have been proposed to generalize against as many known attacks as possible. However, designing a defense method that generalizes to all types of attacks, including unseen ones, is not realistic, because defense systems operate in a dynamic environment in which many attackers use various unique attacks. Instead, the defense system needs to upgrade itself by utilizing few-shot defense feedback and efficient memory. We therefore propose the first continual adversarial defense (CAD) framework, which adapts to any attack in a dynamic scenario where various attacks emerge stage by stage. In practice, CAD is modeled under four principles: (1) continual adaptation to new attacks without catastrophic forgetting, (2) few-shot adaptation, (3) memory-efficient adaptation, and (4) high accuracy on both clean and adversarial images. We leverage cutting-edge continual learning, few-shot learning, and ensemble learning techniques to satisfy these principles. Experiments on CIFAR-10 and ImageNet-100 validate the effectiveness of our approach against multiple stages of 10 modern adversarial attacks and show significant improvements over 10 baseline methods. In particular, CAD adapts quickly with minimal feedback and a low cost of defense failure, while maintaining good performance against old attacks. Our research sheds light on a brand-new paradigm for continual defense adaptation against dynamic and evolving attacks.