Deep neural networks (DNNs) are known to be susceptible to adversarial examples, which cause significant performance degradation. In black-box attack scenarios, a considerable gap in attack performance persists between the surrogate model and the target model. This work focuses on enhancing the transferability of adversarial examples to narrow this gap. We observe that the gradient information around the clean image, i.e., the Neighbourhood Gradient Information (NGI), offers high transferability. Leveraging this, we propose the NGI-Attack, which incorporates Example Backtracking and Multiplex Mask strategies to fully exploit this gradient information and enhance transferability. Specifically, we first adopt Example Backtracking to accumulate Neighbourhood Gradient Information as the initial momentum term. Multiplex Mask, which forms a multi-way attack strategy, forces the network to focus on non-discriminative regions, yielding richer gradient information within only a few iterations. Extensive experiments demonstrate that our approach significantly enhances adversarial transferability. In particular, when attacking numerous defense models, we achieve an average attack success rate of 95.8%. Notably, our method can be plugged into any off-the-shelf algorithm to improve its attack performance without additional time cost.
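The two-stage structure described above — accumulating gradient information in the clean image's neighbourhood as an initial momentum term, then running a momentum-based iterative attack — can be sketched as follows. This is a minimal, hypothetical illustration on a 1-D toy loss, not the authors' implementation: the function names (`example_backtracking`, `ngi_attack`), the toy gradient, and all hyperparameters are assumptions for exposition, and the Multiplex Mask component is omitted.

```python
def grad(x):
    # Toy surrogate-model loss gradient: d/dx of (x - 3)^2.
    # A real attack would use the surrogate network's input gradient.
    return 2.0 * (x - 3.0)

def example_backtracking(x_clean, steps=5, step_size=0.1, decay=1.0):
    """Accumulate Neighbourhood Gradient Information around the clean
    input as an initial momentum term (Example Backtracking sketch)."""
    momentum, x = 0.0, x_clean
    for _ in range(steps):
        g = grad(x)
        momentum = decay * momentum + g / (abs(g) + 1e-12)  # normalised accumulation
        # Backtrack: step against the attack direction so the samples
        # stay in the neighbourhood of the clean input.
        x = x - step_size * (1.0 if momentum > 0 else -1.0)
    return momentum

def ngi_attack(x_clean, eps=0.5, iters=10, decay=1.0):
    """Momentum iterative attack seeded with the accumulated NGI momentum."""
    momentum = example_backtracking(x_clean)
    alpha, x = eps / iters, x_clean
    for _ in range(iters):
        g = grad(x)
        momentum = decay * momentum + g / (abs(g) + 1e-12)
        # Ascend the loss along the momentum sign, staying in the eps-ball.
        x = x + alpha * (1.0 if momentum > 0 else -1.0)
        x = max(x_clean - eps, min(x_clean + eps, x))
    return x
```

Because the momentum is pre-loaded with neighbourhood gradients before the first attack iteration, the update direction is informative from the start, which is the property the abstract credits for improved transferability under small iteration budgets.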