Recent advances in masked image modeling (MIM) have made it a prevailing framework for self-supervised visual representation learning. Like most deep neural networks, however, MIM-pretrained models remain vulnerable to adversarial attacks, which limits their practical application, and this issue has received little research attention. In this paper, we investigate how this powerful self-supervised learning paradigm can provide adversarial robustness to downstream classifiers. In the course of this study, we find that noisy image modeling (NIM), a simple variant of MIM that adopts denoising as the pretext task, reconstructs noisy images surprisingly well despite severe corruption. Motivated by this observation, we propose an adversarial defense method, referred to as De^3, that exploits the pretrained decoder for denoising. Through De^3, NIM enhances adversarial robustness beyond merely providing pretrained features. Furthermore, we incorporate a simple modification, sampling the noise scale hyperparameter from random distributions, which enables the defense to achieve a better and tunable trade-off between accuracy and robustness. Experimental results demonstrate that, in terms of adversarial robustness, NIM is superior to MIM thanks to its effective denoising capability. Moreover, the defense provided by NIM achieves performance on par with adversarial training while offering the additional advantage of tunability. Source code and models are available at https://github.com/youzunzhi/NIM-AdvDef.
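The defense described above can be illustrated with a minimal sketch. It is not the released implementation. It assumes that the defense perturbs the input with Gaussian noise, that the noise scale is drawn uniformly from a range, and that the pretrained decoder and the downstream classifier can be treated as opaque callables. The names `sample_noise_scale`, `add_gaussian_noise`, `defend`, and `sigma_range` are hypothetical and do not come from the paper or its repository.

```python
import random
from typing import Callable, List, Tuple

# An image is represented as a flat list of pixel intensities (illustrative only).
Image = List[float]


def sample_noise_scale(rng: random.Random, low: float, high: float) -> float:
    """Draw a noise scale from a uniform distribution (an assumed choice)."""
    return rng.uniform(low, high)


def add_gaussian_noise(image: Image, sigma: float, rng: random.Random) -> Image:
    """Add zero-mean Gaussian noise with standard deviation sigma to each pixel."""
    return [pixel + rng.gauss(0.0, sigma) for pixel in image]


def defend(
    image: Image,
    denoiser: Callable[[Image], Image],
    classifier: Callable[[Image], int],
    rng: random.Random,
    sigma_range: Tuple[float, float] = (0.0, 0.5),
) -> int:
    """Noise the input, denoise it, then classify the restored image.

    `denoiser` stands in for the pretrained NIM model and `classifier`
    for the downstream model; both are placeholders supplied by the caller.
    """
    sigma = sample_noise_scale(rng, *sigma_range)
    noisy = add_gaussian_noise(image, sigma, rng)
    restored = denoiser(noisy)
    return classifier(restored)


if __name__ == "__main__":
    rng = random.Random(0)
    image = [0.2, 0.8, 0.5, 0.9]
    # Toy stand-ins: clamp pixels to [0, 1], then threshold the mean intensity.
    toy_denoiser = lambda x: [min(1.0, max(0.0, p)) for p in x]
    toy_classifier = lambda x: int(sum(x) / len(x) > 0.5)
    print(defend(image, toy_denoiser, toy_classifier, rng))
```

Under these assumptions, `sigma_range` plays the role of the tunable knob: changing the range changes how strongly inputs are perturbed before denoising, which is presumably where the accuracy and robustness trade-off is adjusted.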