Recent advances in masked image modeling (MIM) have made it a prevailing framework for self-supervised visual representation learning. Like most deep neural networks, however, MIM-pretrained models remain vulnerable to adversarial attacks, which limits their practical application, and this issue has received little research attention. In this paper, we investigate how this powerful self-supervised learning paradigm can provide adversarial robustness to downstream classifiers. During the exploration, we find that noisy image modeling (NIM), a simple variant of MIM that adopts denoising as the pretext task, reconstructs noisy images surprisingly well despite severe corruption. Motivated by this observation, we propose an adversarial defense method, referred to as De^3, that exploits the pretrained decoder for denoising, so that NIM enhances adversarial robustness beyond merely providing pretrained features. Furthermore, we incorporate a simple modification, sampling the noise scale hyperparameter from random distributions, which enables the defense to achieve a better and tunable trade-off between accuracy and robustness. Experimental results demonstrate that NIM is more adversarially robust than MIM thanks to its effective denoising capability. Moreover, the defense provided by NIM achieves performance on par with adversarial training while offering the additional advantage of tunability. Source code and models will be made available.
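The defense described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that inference consists of adding Gaussian noise to the input, denoising it, and then classifying it, and that the noise scale is drawn from a uniform distribution. The abstract specifies neither the noise type nor the distribution. All names here (`sample_noise_scale`, `add_gaussian_noise`, `defend_and_classify`, `toy_denoiser`, `toy_classifier`) are hypothetical, and the toy denoiser and classifier are trivial stand-ins for the pretrained NIM decoder and the downstream classifier.

```python
import random


def sample_noise_scale(rng, low=0.1, high=0.5):
    """Draw a noise scale from a uniform range (assumed distribution)."""
    return rng.uniform(low, high)


def add_gaussian_noise(pixels, sigma, rng):
    """Corrupt a flat list of pixel values with zero-mean Gaussian noise."""
    return [p + rng.gauss(0.0, sigma) for p in pixels]


def defend_and_classify(pixels, denoiser, classifier, rng, low=0.1, high=0.5):
    """Assumed pipeline: random noise scale -> add noise -> denoise -> classify."""
    sigma = sample_noise_scale(rng, low, high)
    noisy = add_gaussian_noise(pixels, sigma, rng)
    restored = denoiser(noisy)
    return classifier(restored), sigma


def toy_denoiser(pixels):
    """Placeholder for the pretrained decoder: clamps values to [0, 1]."""
    return [min(1.0, max(0.0, p)) for p in pixels]


def toy_classifier(pixels):
    """Placeholder classifier: thresholds the mean intensity at 0.5."""
    return int(sum(pixels) / len(pixels) > 0.5)


if __name__ == "__main__":
    rng = random.Random(0)
    label, sigma = defend_and_classify([0.2] * 16, toy_denoiser, toy_classifier, rng)
    print(f"sampled sigma={sigma:.3f}, predicted label={label}")
```

In this sketch, widening or shifting the `low` and `high` bounds changes how strongly inputs are perturbed before denoising, which is one plausible reading of the tunable accuracy-robustness trade-off mentioned in the abstract.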