Adversarial transferability enables black-box attacks on unknown victim deep neural networks (DNNs), making attacks viable in real-world scenarios. Current transferable attacks create adversarial perturbation over the entire image, resulting in excessive noise that overfits the source model. Concentrating perturbation on dominant image regions that are model-agnostic is crucial to improving adversarial efficacy. However, limiting perturbation to local regions in the spatial domain proves inadequate for improving transferability. To this end, we propose a transferable adversarial attack with fine-grained perturbation optimization in the frequency domain, creating centralized perturbation. We devise a systematic pipeline to dynamically constrain perturbation optimization to dominant frequency coefficients. The constraint is optimized in parallel at each iteration, ensuring that perturbation optimization stays directionally aligned with model prediction. Our approach centralizes perturbation on sample-specific important frequency features, which are shared across DNNs, effectively mitigating overfitting to the source model. Experiments demonstrate that by dynamically centralizing perturbation on dominant frequency coefficients, the crafted adversarial examples exhibit stronger transferability and can bypass various defenses.
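To make the idea of constraining perturbation to dominant frequency coefficients concrete, the following is a minimal sketch and not the method proposed here. It assumes a small square single-channel perturbation, an orthonormal 2D DCT as the frequency transform, and a simple largest-magnitude rule for choosing which coefficients to keep. The names `dct2`, `idct2`, `centralize`, and `keep` are hypothetical. In particular, the magnitude-based selection stands in for the constraint that the abstract describes as being optimized at each iteration against the model prediction.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def dct2(block):
    """2D DCT of a square block: D * X * D^T."""
    d = dct_matrix(len(block))
    return matmul(matmul(d, block), transpose(d))


def idct2(coeffs):
    """Inverse 2D DCT: D^T * C * D (D is orthonormal)."""
    d = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(d), coeffs), d)


def centralize(perturbation, keep):
    """Zero all but the `keep` largest-magnitude DCT coefficients.

    Assumption: magnitude ranking is a stand-in for the learned,
    per-iteration constraint described in the abstract.
    """
    coeffs = dct2(perturbation)
    n = len(coeffs)
    ranked = sorted(((abs(coeffs[i][j]), i, j)
                     for i in range(n) for j in range(n)), reverse=True)
    mask = [[0.0] * n for _ in range(n)]
    for _, i, j in ranked[:keep]:
        mask[i][j] = 1.0
    masked = [[coeffs[i][j] * mask[i][j] for j in range(n)] for i in range(n)]
    return idct2(masked), mask


# Toy 4x4 perturbation (illustrative values only).
delta = [[0.03, -0.01, 0.02, 0.00],
         [-0.02, 0.01, 0.00, 0.01],
         [0.01, 0.00, -0.03, 0.02],
         [0.00, 0.02, 0.01, -0.01]]
central, mask = centralize(delta, keep=4)
```

In an iterative attack, a step like `centralize` would be applied to the perturbation update at every iteration, so that only the selected coefficients accumulate change. How the selection is driven by the model is exactly what the sketch leaves out.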