Recent studies have shown that Deep Neural Networks (DNNs) are susceptible to adversarial attacks, with frequency-domain analysis underscoring the significance of high-frequency components in influencing model predictions. Conversely, targeting low-frequency components has proven effective in enhancing attack transferability against black-box models. In this study, we introduce a frequency-decomposition-based feature mixing method to exploit these frequency characteristics in both clean and adversarial samples. Our findings suggest that incorporating features of clean samples into adversarial features extracted from adversarial examples is more effective in attacking normally-trained models, while combining clean features with adversarial features extracted from the low-frequency parts decomposed from the adversarial samples yields better results in attacking defense models. However, a conflict arises when these two mixing approaches are employed simultaneously. To tackle this issue, we propose a cross-frequency meta-optimization approach comprising a meta-train step, a meta-test step, and a final update. In the meta-train step, we leverage the low-frequency components of adversarial samples to boost the transferability of attacks against defense models. Meanwhile, in the meta-test step, we utilize adversarial samples to stabilize gradients, thereby enhancing the attack's transferability against normally-trained models. In the final update, we update the adversarial sample based on the gradients obtained from both the meta-train and meta-test steps. Extensive experiments on the ImageNet-Compatible dataset confirm the effectiveness of our method in improving the transferability of attacks against both normally-trained CNNs and defense models. The source code is available at https://github.com/WJJLL/MetaSSA.
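The cross-frequency meta-optimization loop described above can be sketched in miniature. The following NumPy-only sketch is illustrative, not the authors' implementation: the real method attacks a DNN surrogate with feature mixing, whereas here `grad_loss` is an analytic gradient of a toy objective, the FFT low-pass stands in for the frequency decomposition, and all names and hyperparameter values (`keep`, `eps`, `alpha`, `steps`) are hypothetical choices for the sketch.

```python
import numpy as np

def low_frequency(x, keep=0.25):
    """Keep only the lowest `keep` fraction of spatial frequencies (FFT low-pass).

    Stand-in for the paper's frequency decomposition of the adversarial sample.
    """
    F = np.fft.fftshift(np.fft.fft2(x))
    h, w = x.shape
    mask = np.zeros_like(F)
    ch, cw = h // 2, w // 2
    rh, rw = int(h * keep / 2), int(w * keep / 2)
    mask[ch - rh:ch + rh, cw - rw:cw + rw] = 1
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))

def grad_loss(x, w):
    """Toy surrogate gradient: analytic gradient of L(x) = sum(sin(w * x)).

    In the actual method this would be the gradient of the attack loss
    through the surrogate DNN with mixed features.
    """
    return w * np.cos(w * x)

def meta_attack(x_clean, w=3.0, eps=8 / 255, alpha=2 / 255, steps=10):
    """Cross-frequency meta-optimization sketch: meta-train on the
    low-frequency view, meta-test on the full sample, then a final
    sign-gradient update projected back into the epsilon ball."""
    x_adv = x_clean.copy()
    for _ in range(steps):
        # Meta-train: gradient on the low-frequency part of the adversarial sample
        # (targets transferability against defense models).
        g_train = grad_loss(low_frequency(x_adv), w)
        # Meta-test: gradient on the full adversarial sample
        # (stabilizes gradients for normally-trained models).
        g_test = grad_loss(x_adv, w)
        # Final update: combine both gradients, then clip to the L-inf ball.
        x_adv = x_adv + alpha * np.sign(g_train + g_test)
        x_adv = np.clip(x_adv, x_clean - eps, x_clean + eps)
        x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv
```

The two clips at the end of each iteration enforce the standard L-infinity constraint and valid pixel range; combining the meta-train and meta-test gradients in a single update is how the sketch reconciles the two otherwise conflicting mixing strategies.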