Deep learning models have shown considerable vulnerability to adversarial attacks, particularly as attacker strategies become more sophisticated. While traditional adversarial training (AT) techniques offer some resilience, they typically defend against a single type of attack, e.g., the $\ell_\infty$-norm attack, and can fail against other attack types. This paper introduces a computationally efficient multilevel $\ell_p$ defense, called the Efficient Robust Mode Connectivity (EMRC) method, which aims to enhance a deep learning model's resilience against multiple $\ell_p$-norm attacks. Analogous to analytical continuation approaches in continuous optimization, the method blends two $p$-specific adversarially optimal models, the $\ell_1$- and $\ell_\infty$-norm AT solutions, to provide good adversarial robustness across a range of $p$. We present experiments demonstrating that our approach outperforms AT-$\ell_\infty$, E-AT, and MSD under various attacks on the CIFAR-10 and CIFAR-100 datasets with the PreResNet110, WideResNet, and ViT-Base architectures.
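The core idea of blending two $p$-specific AT solutions can be sketched as parameter-space interpolation between the $\ell_1$-AT and $\ell_\infty$-AT checkpoints, in the spirit of mode connectivity. This is a minimal illustrative sketch, not the EMRC algorithm itself (which additionally optimizes the connecting path adversarially); the function name and the dict-of-parameters representation are assumptions for illustration.

```python
# Hypothetical sketch: linearly blend two adversarially trained
# parameter sets (e.g., l1-AT and linf-AT solutions). Parameters are
# represented as a dict mapping parameter names to values; in practice
# these would be model state dicts (tensors), blended elementwise.

def blend_parameters(theta_l1, theta_linf, alpha):
    """Interpolate between the l1-AT and linf-AT solutions.

    alpha = 0 returns the l1-AT model, alpha = 1 the linf-AT model;
    intermediate alphas trade off robustness across l_p threat models.
    """
    assert theta_l1.keys() == theta_linf.keys()
    return {
        name: (1.0 - alpha) * theta_l1[name] + alpha * theta_linf[name]
        for name in theta_l1
    }
```

In practice, one would sweep `alpha` over $[0, 1]$ and evaluate each blended model against several $\ell_p$ attacks to locate points on the path that are simultaneously robust to all of them.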