Skip connections are an essential ingredient that allows modern deep models to be deeper and more powerful. Despite their huge success in normal scenarios (e.g., state-of-the-art classification performance on natural examples), we investigate and identify an intriguing property of skip connections under adversarial scenarios: they allow easier generation of highly transferable adversarial examples. Specifically, in ResNet-like models (with skip connections), we find that biasing backpropagation to favor gradients from skip connections, while suppressing those from residual modules via a decay factor, allows one to craft adversarial examples with high transferability. Based on this insight, we propose the Skip Gradient Method (SGM). Although it starts from ResNet-like models in the vision domain, we further extend SGM to more advanced architectures, including Vision Transformers (ViTs), models with varying-length paths, and other domains such as natural language processing. We conduct comprehensive transfer-based attacks against diverse model families, including ResNets, Transformers, Inceptions, Neural Architecture Search-based models, and Large Language Models (LLMs). The results demonstrate that employing SGM greatly improves the transferability of crafted attacks in almost all cases. Furthermore, we show that SGM remains effective under more challenging settings such as ensemble-based attacks, targeted attacks, and attacks against defense-equipped models. Finally, we provide theoretical explanations and empirical insights into how SGM works. Our findings not only motivate new adversarial research into the architectural characteristics of models but also open up further challenges for secure model architecture design. Our code is available at https://github.com/mo666666/SGM.
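The core idea of decaying residual-module gradients while keeping skip-connection gradients intact can be sketched with backward hooks in PyTorch. The sketch below is illustrative, not the authors' released implementation: `ResidualBlock`, `apply_sgm`, `sgm_fgsm`, and the decay parameter name `gamma` are assumptions chosen for clarity, and the attack step shown is a plain single-step (FGSM-style) update using the SGM-modified gradients.

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """A toy residual block: y = x + f(x), where f is the residual branch."""

    def __init__(self, dim):
        super().__init__()
        self.branch = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        return x + self.branch(x)


def apply_sgm(model, gamma):
    """Register hooks that multiply gradients flowing back through each
    residual branch by a decay factor gamma (the skip path is untouched)."""
    hooks = []
    for module in model.modules():
        if isinstance(module, ResidualBlock):
            hook = module.branch.register_full_backward_hook(
                lambda mod, grad_in, grad_out: tuple(
                    g * gamma if g is not None else None for g in grad_in
                )
            )
            hooks.append(hook)
    return hooks


def sgm_fgsm(model, x, y, eps=8 / 255, gamma=0.5):
    """One-step attack using SGM-decayed gradients (illustrative only)."""
    hooks = apply_sgm(model, gamma)
    x_adv = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x_adv), y)
    loss.backward()
    for h in hooks:  # restore normal backpropagation
        h.remove()
    return (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()
```

With `gamma = 1` the hooks are a no-op and backpropagation is unchanged; with `gamma = 0` the gradient reaching the input comes entirely from the skip connections, which is the extreme case of the bias described above.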