Transfer-based adversarial attacks pose a severe threat to real-world deep learning systems because they do not require access to the target model. Adversarial training (AT), which is recognized as the strongest defense against white-box attacks, also yields high robustness to (black-box) transfer-based attacks. However, AT incurs heavy computational overhead because it optimizes the adversarial examples throughout the entire training process. In this paper, we demonstrate that such heavy optimization is not necessary for AT against transfer-based attacks. Instead, a one-shot adversarial augmentation prior to training is sufficient, and we name this new defense paradigm Data-centric Robust Learning (DRL). Our experimental results show that DRL outperforms widely used AT techniques (e.g., PGD-AT, TRADES, EAT, and FAT) in terms of black-box robustness and, when combined with diverse data augmentations and loss regularizations, even surpasses the top-1 defense on RobustBench. We also identify other benefits of DRL, such as improved model generalization and robust fairness.
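To make the contrast with AT concrete, the following is a minimal sketch of one-shot adversarial augmentation followed by ordinary training. The abstract does not specify how the adversarial examples are generated, so everything concrete here is an assumption made for illustration: the toy 2D data (`make_blobs`), the logistic-regression surrogate and target models, the single-step sign-gradient (FGSM-style) attack, and the budget `eps`. It should not be read as the method evaluated in the paper.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logit(x, w, b):
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_logreg(xs, ys, epochs=200, lr=0.5):
    """Standard (non-adversarial) full-batch gradient descent."""
    dim, n = len(xs[0]), len(xs)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * dim, 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(logit(x, w, b)) - y  # d(loss)/d(logit)
            gw = [g + err * xi for g, xi in zip(gw, x)]
            gb += err
        w = [wi - lr * g / n for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def fgsm(x, y, w, b, eps):
    """One sign-gradient step on the input (assumed attack, for illustration)."""
    err = sigmoid(logit(x, w, b)) - y
    sign = lambda g: (g > 0) - (g < 0)
    # For logistic loss, d(loss)/d(x_i) = err * w_i.
    return [xi + eps * sign(err * wi) for xi, wi in zip(x, w)]


def one_shot_augment(xs, ys, surrogate, eps):
    """Craft adversarial copies ONCE, before training, and append them."""
    w, b = surrogate
    adv = [fgsm(x, y, w, b, eps) for x, y in zip(xs, ys)]
    return xs + adv, ys + ys


def accuracy(xs, ys, model):
    w, b = model
    hits = sum((logit(x, w, b) >= 0) == (y == 1) for x, y in zip(xs, ys))
    return hits / len(xs)


def make_blobs(n_per_class, seed=0):
    # Hypothetical toy data: two Gaussian clusters in 2D.
    rng = random.Random(seed)
    xs, ys = [], []
    for label, center in ((0, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            xs.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
            ys.append(label)
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_blobs(100)
    surrogate = train_logreg(xs, ys)                  # model the attack is run against
    aug_x, aug_y = one_shot_augment(xs, ys, surrogate, eps=0.5)
    target = train_logreg(aug_x, aug_y)               # plain training on the fixed set
    print("clean accuracy of target:", accuracy(xs, ys, target))
```

The point of the sketch is structural: the attack runs once over the dataset before training, and the training loop itself contains no inner adversarial optimization, whereas AT would regenerate adversarial examples at every training step.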