Along with recent diffusion models, randomized smoothing has become one of the few tangible approaches that offer adversarial robustness to models at scale, such as large pre-trained models. Specifically, one can perform randomized smoothing on any classifier via a simple "denoise-and-classify" pipeline, so-called denoised smoothing, provided that an accurate denoiser, such as a diffusion model, is available. In this paper, we investigate the trade-off between accuracy and certified robustness in denoised smoothing: for example, we ask which representation of a diffusion model would maximize the certified robustness of denoised smoothing. We consider a new objective that targets the collective robustness of smoothed classifiers across multiple noise levels with a shared diffusion model, which also suggests a new way to compensate for the accuracy cost that randomized smoothing pays for its certified robustness. This objective motivates us to fine-tune the diffusion model (a) to perform consistent denoising whenever the original image is recoverable, but (b) to generate rather diverse outputs otherwise. Our experiments show that this fine-tuning scheme for diffusion models, combined with multi-scale smoothing, achieves the strong certified robustness available at the highest noise level while keeping accuracy close to that of non-smoothed classifiers.
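For readers unfamiliar with the "denoise-and-classify" pipeline mentioned in the abstract, the following is a minimal sketch of single-scale denoised smoothing with a certified L2 radius. It is an illustration under stated assumptions, not the paper's implementation: the function name `certify_denoised_smoothing` and the callables `denoise` and `classify` are hypothetical placeholders for a diffusion denoiser and a pre-trained classifier, and a simple Hoeffding lower bound is swapped in for the Clopper-Pearson bound commonly used in randomized smoothing. The multi-scale objective and the fine-tuning scheme described above are not covered.

```python
import math
from statistics import NormalDist

import numpy as np


def certify_denoised_smoothing(x, denoise, classify, sigma, num_classes,
                               n0=100, n=1000, alpha=0.001, seed=0):
    """Return (class, certified L2 radius), or (None, 0.0) to abstain.

    `denoise(noisy, sigma)` and `classify(image)` are hypothetical stand-ins
    for a diffusion denoiser and an off-the-shelf classifier.
    """
    rng = np.random.default_rng(seed)

    def class_counts(num_samples):
        # Add Gaussian noise, denoise, classify, and tally the predicted labels.
        counts = np.zeros(num_classes, dtype=int)
        for _ in range(num_samples):
            noisy = x + sigma * rng.standard_normal(x.shape)
            counts[classify(denoise(noisy, sigma))] += 1
        return counts

    # Select the candidate class on one batch of noise draws, then estimate
    # its probability on a fresh batch so the bound below stays valid.
    top = int(np.argmax(class_counts(n0)))
    p_hat = class_counts(n)[top] / n

    # Assumption: one-sided Hoeffding lower bound (holds with prob. >= 1 - alpha),
    # used here instead of the tighter Clopper-Pearson bound.
    p_lower = p_hat - math.sqrt(math.log(1.0 / alpha) / (2.0 * n))
    if p_lower <= 0.5:
        return None, 0.0

    # Standard randomized-smoothing radius: R = sigma * Phi^{-1}(p_lower).
    return top, sigma * NormalDist().inv_cdf(p_lower)


# Toy usage: identity "denoiser" and a classifier that thresholds the mean pixel.
x = np.full(16, 1.0)
label, radius = certify_denoised_smoothing(
    x,
    denoise=lambda noisy, sigma: noisy,
    classify=lambda image: int(image.mean() > 0.0),
    sigma=0.25,
    num_classes=2,
)
```

In this sketch a larger `sigma` can certify a larger radius but makes the denoise-and-classify step less accurate, which is the accuracy-robustness trade-off the abstract refers to.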