Despite their success in many fields, deep neural networks remain vulnerable to adversarial samples, which has been a major impediment to their broad application. Several recent studies suggest that adversarially trained models place greater emphasis on low-frequency information to achieve higher robustness. Previous attempts to leverage this frequency characteristic share a common problem: applying low-pass filters directly to input images causes an irreversible loss of discriminative information and generalizes poorly to datasets with distinct frequency features. This paper presents a plug-and-play module, the Frequency Preference Control Module, that adaptively reconfigures the low- and high-frequency components of intermediate feature representations, making better use of frequency information in robust learning. Empirical studies show that the proposed module can be easily incorporated into any adversarial training framework and further improves model robustness across different architectures and datasets. We also report experiments examining how the frequency bias of robust models affects the adversarial training process and the final robustness.
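To make the idea of reconfiguring the low- and high-frequency components of a feature map more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a hard circular low-pass mask in the 2D Fourier domain and two fixed scalar weights; the names `split_frequency`, `reweight_frequency`, `radius`, `w_low`, and `w_high` are hypothetical, and the abstract does not specify how the actual module computes or adapts its weighting.

```python
import numpy as np


def split_frequency(feat, radius):
    """Split a (C, H, W) feature map into low- and high-frequency parts.

    Assumption: "low frequency" means every Fourier coefficient whose
    distance from the DC component is at most `radius` (a hard mask).
    """
    _, h, w = feat.shape
    # Integer frequency indices, shifted so that DC sits at the center.
    fy = np.fft.fftshift(np.fft.fftfreq(h)) * h
    fx = np.fft.fftshift(np.fft.fftfreq(w)) * w
    dist = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    low_mask = (dist <= radius).astype(float)

    # Per-channel 2D FFT over the spatial axes, centered to match the mask.
    spectrum = np.fft.fftshift(np.fft.fft2(feat, axes=(-2, -1)), axes=(-2, -1))
    low_spectrum = np.fft.ifftshift(spectrum * low_mask, axes=(-2, -1))
    low = np.fft.ifft2(low_spectrum, axes=(-2, -1)).real

    # The high-frequency part is the residual, so low + high == feat.
    high = feat - low
    return low, high


def reweight_frequency(feat, radius, w_low, w_high):
    """Recombine the two bands with scalar weights (hypothetical stand-ins
    for whatever adaptive weighting a learned module would produce)."""
    low, high = split_frequency(feat, radius)
    return w_low * low + w_high * high
```

With `w_low = w_high = 1` the feature map is returned unchanged, and with `w_high = 0` the operation reduces to a plain low-pass filter on the features. Operating on intermediate features rather than on the input image is what lets the original high-frequency content remain available to the network.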