Deep neural networks are known to be vulnerable to adversarial examples, which are crafted by adding human-imperceptible perturbations to benign inputs. With attack success rates near 100% in the white-box setting, the focus has shifted to black-box attacks, where the transferability of adversarial examples has gained significant attention. In either setting, common gradient-based methods apply the sign function to the gradient to generate the perturbation at each update, which offers a roughly correct direction and has been highly successful. However, little work has examined its possible limitations. In this work, we observe that the deviation between the original gradient and the generated noise may lead to inaccurate gradient update estimation and suboptimal solutions for adversarial transferability. To this end, we propose a Sampling-based Fast Gradient Rescaling Method (S-FGRM). Specifically, we use data rescaling to substitute for the sign function without extra computational cost. We further propose a Depth First Sampling method to eliminate the fluctuation of rescaling and stabilize the gradient update. Our method can be used in any gradient-based attack and can be integrated with various input transformation or ensemble methods to further improve adversarial transferability. Extensive experiments on the standard ImageNet dataset show that our method significantly boosts the transferability of gradient-based attacks and outperforms state-of-the-art baselines.
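
The deviation between a gradient and its sign can be made concrete with a toy example. The sketch below is illustrative only: the abstract does not give the rescaling formula, so `rescaled_step` is a hypothetical magnitude-aware substitute (standardized log-magnitudes passed through a sigmoid), not the S-FGRM definition, and the gradient values and the names `sign_step`, `rescaled_step`, and `cosine` are assumptions made for this example. It compares how well each update direction aligns with the original gradient.

```python
import math


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def sign_step(grad, alpha):
    """Sign-based update: every nonzero component moves by exactly alpha."""
    return [alpha * sign(g) for g in grad]


def rescaled_step(grad, alpha, tiny=1e-12):
    """Hypothetical rescaled update (an assumption, not the paper's formula).

    Keeps each component's sign, but scales its size by a sigmoid of the
    standardized log2-magnitude, so larger gradient entries move further.
    Every component stays within [-alpha, alpha].
    """
    logs = [math.log2(abs(g) + tiny) for g in grad]
    mean = sum(logs) / len(logs)
    std = math.sqrt(sum((m - mean) ** 2 for m in logs) / len(logs))
    if std == 0.0:
        z_scores = [0.0] * len(logs)  # all magnitudes equal: uniform scaling
    else:
        z_scores = [(m - mean) / std for m in logs]
    return [alpha * sign(g) / (1.0 + math.exp(-z)) for g, z in zip(grad, z_scores)]


# Toy gradient with components of very different magnitudes (made-up values).
toy_grad = [0.9, -0.01, 0.002, -0.5]
alpha = 1.0

print("cosine(grad, sign step):    ", cosine(toy_grad, sign_step(toy_grad, alpha)))
print("cosine(grad, rescaled step):", cosine(toy_grad, rescaled_step(toy_grad, alpha)))
```

On this toy gradient the sign step treats the 0.002 component the same as the 0.9 component, which pulls the update away from the true gradient direction; the magnitude-aware step keeps the same signs but stays closer to it. Whether that closer alignment improves transferability is the empirical claim of the paper and is not shown by this sketch.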