Deep neural networks (DNNs) are known to be susceptible to adversarial examples, leading to significant performance degradation. In black-box attack scenarios, a considerable attack performance gap persists between the surrogate model and the target model. This work focuses on enhancing the transferability of adversarial examples to narrow this gap. We observe that the gradient information around the clean image, i.e., Neighbourhood Gradient Information (NGI), can offer high transferability. Based on this insight, we introduce NGI-Attack, which incorporates Example Backtracking and Multiplex Mask strategies to exploit this gradient information and enhance transferability. Specifically, we first adopt Example Backtracking to accumulate Neighbourhood Gradient Information as the initial momentum term. Then, we utilize Multiplex Mask to form a multi-way attack strategy that forces the network to focus on non-discriminative regions, which yields richer gradient information within only a few iterations. Extensive experiments demonstrate that our approach significantly enhances adversarial transferability. In particular, when attacking numerous defense models, we achieve an average attack success rate of 95.2%. Notably, our method can seamlessly integrate with any off-the-shelf algorithm, enhancing its attack performance without incurring extra time costs.
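The pipeline described above can be illustrated with a minimal sketch: gradients probed in the neighbourhood of the clean input are accumulated as an initial momentum term (the Example Backtracking step), and a per-iteration binary mask stands in for the Multiplex Mask during the sign-based update. This is not the authors' implementation; the toy quadratic loss, the function names, and all hyperparameters (`eps`, `n_back`, `n_iter`, `mu`) are illustrative assumptions.

```python
import numpy as np

def grad_fn(x, target=np.array([0.0, 0.0])):
    # Toy loss L(x) = ||x - target||^2; its gradient stands in for the
    # surrogate model's loss gradient w.r.t. the input image.
    return 2.0 * (x - target)

def ngi_attack_sketch(x_clean, eps=0.5, n_back=3, n_iter=5, mu=1.0, seed=0):
    rng = np.random.default_rng(seed)
    alpha = eps / n_iter  # per-step size so the total budget stays at eps

    # Example Backtracking (sketch): probe points around the clean image and
    # accumulate their normalized gradients as the initial momentum term.
    g = np.zeros_like(x_clean)
    for _ in range(n_back):
        x_probe = x_clean + rng.uniform(-eps, eps, size=x_clean.shape)
        grad = grad_fn(x_probe)
        g = mu * g + grad / (np.abs(grad).sum() + 1e-12)

    # Iterative sign update; a random binary mask stands in for the Multiplex
    # Mask that steers the attack toward different regions each iteration.
    x_adv = x_clean.copy()
    for _ in range(n_iter):
        mask = rng.integers(0, 2, size=x_adv.shape).astype(float)
        grad = grad_fn(x_adv) * mask
        g = mu * g + grad / (np.abs(grad).sum() + 1e-12)
        x_adv = x_adv + alpha * np.sign(g)
        # Project back into the eps-ball around the clean image.
        x_adv = np.clip(x_adv, x_clean - eps, x_clean + eps)
    return x_adv

x0 = np.array([1.0, -1.0])
x_adv = ngi_attack_sketch(x0)
print(np.max(np.abs(x_adv - x0)) <= 0.5)  # perturbation stays within budget
```

In a real attack, `grad_fn` would backpropagate a classification loss through the surrogate network, and the accumulated momentum would then warm-start any off-the-shelf iterative attack.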