Along with recent diffusion models, randomized smoothing has become one of a few tangible approaches that offer adversarial robustness to models at scale, e.g., large pre-trained models. Specifically, one can perform randomized smoothing on any classifier via a simple "denoise-and-classify" pipeline, so-called denoised smoothing, given that an accurate denoiser, such as a diffusion model, is available. In this paper, we present scalable methods to address the current trade-off between certified robustness and accuracy in denoised smoothing. Our key idea is to "selectively" apply smoothing among multiple noise scales, coined multi-scale smoothing, which can be efficiently implemented with a single diffusion model. This approach also suggests a new objective for comparing the collective robustness of multi-scale smoothed classifiers, and raises the question of which representation of the diffusion model would maximize that objective. To address this, we propose to further fine-tune the diffusion model (a) to perform consistent denoising whenever the original image is recoverable, but (b) to generate rather diverse outputs otherwise. Our experiments show that the proposed multi-scale smoothing scheme, combined with diffusion fine-tuning, achieves the strong certified robustness available at high noise levels while keeping accuracy closer to that of non-smoothed classifiers.
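To make the "denoise-and-classify" pipeline and the idea of choosing among noise scales concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the method of the paper: `toy_denoise` and `toy_classify` are hypothetical stand-ins for a diffusion denoiser and a pre-trained classifier, the selection rule (accept the largest noise scale whose prediction is confident, otherwise fall back to a smaller one) is only one plausible reading of "selectively", and the confidence bound is a simple Hoeffding bound rather than whatever the authors use. The radius formula `sigma * Phi^{-1}(p_lower)` is the standard single-scale randomized smoothing certificate; the sketch does not attempt to certify the combined multi-scale classifier.

```python
import math
import random
from statistics import NormalDist


def toy_denoise(noisy, sigma):
    # Hypothetical stand-in for a diffusion denoiser: returns the input unchanged.
    return noisy


def toy_classify(x):
    # Hypothetical stand-in for a pre-trained classifier: sign of the mean.
    return 1 if sum(x) / len(x) > 0 else 0


def smoothed_predict(x, classify, denoise, sigma, n=1000, alpha=0.001, rng=None):
    """Monte Carlo estimate of the denoise-and-classify smoothed classifier.

    Returns the most frequent label and a Hoeffding lower bound on its
    probability. Simplification: the same samples pick the label and estimate
    its probability; a rigorous certificate would use separate samples.
    """
    rng = rng or random.Random(0)
    counts = {}
    for _ in range(n):
        noisy = [v + rng.gauss(0.0, sigma) for v in x]  # add Gaussian noise
        label = classify(denoise(noisy, sigma))         # denoise, then classify
        counts[label] = counts.get(label, 0) + 1
    top, hits = max(counts.items(), key=lambda kv: kv[1])
    p_lower = hits / n - math.sqrt(math.log(1.0 / alpha) / (2.0 * n))
    return top, p_lower


def certified_radius(p_lower, sigma):
    # Standard single-scale L2 radius: sigma * Phi^{-1}(p_lower); 0 if not confident.
    if p_lower <= 0.5:
        return 0.0
    return sigma * NormalDist().inv_cdf(min(p_lower, 1.0 - 1e-12))


def multi_scale_predict(x, classify, denoise, sigmas, n=1000, alpha=0.001, seed=0):
    """Assumed selection rule: try scales from largest to smallest and accept
    the first confident one; abstain (label None) if no scale is confident."""
    rng = random.Random(seed)
    for sigma in sorted(sigmas, reverse=True):
        label, p_lower = smoothed_predict(x, classify, denoise, sigma, n, alpha, rng)
        if p_lower > 0.5:
            return label, sigma, certified_radius(p_lower, sigma)
    return None, None, 0.0


label, sigma, radius = multi_scale_predict(
    [1.0] * 16, toy_classify, toy_denoise, sigmas=(0.25, 0.5, 1.0)
)
```

In this sketch a larger accepted `sigma` yields a larger radius for the same confidence, which is the robustness side of the trade-off the abstract describes; falling back to a smaller `sigma` is what keeps accuracy closer to the unsmoothed classifier on harder inputs.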