Recent studies have shown that Deep Neural Networks (DNNs) are susceptible to adversarial attacks, with frequency-domain analysis underscoring the significance of high-frequency components in influencing model predictions. Conversely, targeting low-frequency components has proven effective in enhancing attack transferability against black-box models. In this study, we introduce a frequency-decomposition-based feature mixing method to exploit these frequency characteristics in both clean and adversarial samples. Our findings suggest that incorporating features of clean samples into adversarial features extracted from adversarial examples is more effective in attacking normally-trained models, while combining clean features with adversarial features extracted from the low-frequency parts decomposed from the adversarial samples yields better results in attacking defense models. However, a conflict arises when these two mixing approaches are employed simultaneously. To tackle this issue, we propose a cross-frequency meta-optimization approach comprising a meta-train step, a meta-test step, and a final update. In the meta-train step, we leverage the low-frequency components of adversarial samples to boost the transferability of attacks against defense models. Meanwhile, in the meta-test step, we utilize adversarial samples to stabilize gradients, thereby enhancing the attack's transferability against normally-trained models. In the final update, we update the adversarial sample based on the gradients obtained from both the meta-train and meta-test steps. Extensive experiments on the ImageNet-Compatible dataset confirm the effectiveness of our method in improving the transferability of attacks against both normally-trained CNNs and defense models. The source code is available at https://github.com/WJJLL/MetaSSA.
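The cross-frequency meta-optimization loop described above can be sketched in miniature. The following NumPy-only sketch is illustrative, not the authors' implementation: the real method attacks a DNN surrogate with feature mixing, whereas here `grad_loss` is an analytic gradient of a toy objective, the FFT low-pass stands in for the frequency decomposition, and all names and hyperparameter values (`keep`, `eps`, `alpha`, `steps`) are hypothetical choices for the sketch.

```python
import numpy as np

def low_frequency(x, keep=0.25):
    """Keep only the lowest `keep` fraction of spatial frequencies (FFT low-pass).

    Stand-in for the paper's frequency decomposition of the adversarial sample.
    """
    F = np.fft.fftshift(np.fft.fft2(x))
    h, w = x.shape
    mask = np.zeros_like(F)
    ch, cw = h // 2, w // 2
    rh, rw = int(h * keep / 2), int(w * keep / 2)
    mask[ch - rh:ch + rh, cw - rw:cw + rw] = 1
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))

def grad_loss(x, w):
    """Toy surrogate gradient: analytic gradient of L(x) = sum(sin(w * x)).

    In the actual method this would be the gradient of the attack loss
    through the surrogate DNN with mixed features.
    """
    return w * np.cos(w * x)

def meta_attack(x_clean, w=3.0, eps=8 / 255, alpha=2 / 255, steps=10):
    """Cross-frequency meta-optimization sketch: meta-train on the
    low-frequency view, meta-test on the full sample, then a final
    sign-gradient update projected back into the epsilon ball."""
    x_adv = x_clean.copy()
    for _ in range(steps):
        # Meta-train: gradient on the low-frequency part of the adversarial sample
        # (targets transferability against defense models).
        g_train = grad_loss(low_frequency(x_adv), w)
        # Meta-test: gradient on the full adversarial sample
        # (stabilizes gradients for normally-trained models).
        g_test = grad_loss(x_adv, w)
        # Final update: combine both gradients, then clip to the L-inf ball.
        x_adv = x_adv + alpha * np.sign(g_train + g_test)
        x_adv = np.clip(x_adv, x_clean - eps, x_clean + eps)
        x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv
```

The two clips at the end of each iteration enforce the standard L-infinity constraint and valid pixel range; combining the meta-train and meta-test gradients in a single update is how the sketch reconciles the two otherwise conflicting mixing strategies.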