Struggle with Adversarial Defense? Try Diffusion

Adversarial attacks induce misclassification by introducing subtle perturbations. Recently, diffusion models have been applied to image classifiers to improve adversarial robustness, either through adversarial training or by purifying adversarial noise. However, diffusion-based adversarial training often suffers from convergence difficulties and high computational cost. In addition, diffusion-based purification inevitably causes data shift and is considered susceptible to stronger adaptive attacks. To tackle these issues, we propose the Truth Maximization Diffusion Classifier (TMDC), a generative Bayesian classifier that builds on pre-trained diffusion models and Bayes' theorem. Unlike data-driven classifiers, TMDC uses the conditional likelihood from diffusion models to determine the class probabilities of input images, which insulates it from the effects of data shift and the limitations of adversarial training. Moreover, to enhance TMDC's resilience against more potent adversarial attacks, we propose an optimization strategy for diffusion classifiers. This strategy post-trains the diffusion model on perturbed datasets with ground-truth labels as conditions, guiding the diffusion model to learn the data distribution and to maximize the likelihood under the ground-truth labels. The proposed method achieves state-of-the-art performance on the CIFAR10 dataset against heavy white-box attacks and strong adaptive attacks. Specifically, with $\epsilon=0.05$, TMDC achieves robust accuracies of 82.81% against $l_{\infty}$ norm-bounded perturbations and 86.05% against $l_{2}$ norm-bounded perturbations.
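The abstract does not spell out how class probabilities are obtained from the conditional likelihood, so the following is only a minimal sketch of the general diffusion-classifier idea, not the authors' implementation. It assumes that the class-conditional denoising error serves as a proxy for $-\log p(x \mid y)$ and that Bayes' theorem turns those scores into a posterior. The names `estimate_errors`, `class_posterior`, `toy_eps_model`, `CLASS_MEANS`, and `alphas_bar` are hypothetical, and the toy noise predictor stands in for a pre-trained conditional diffusion model.

```python
import numpy as np

# Hypothetical toy setup: each class is a point mass at a known mean.
CLASS_MEANS = np.array([[0.0, 0.0], [3.0, 3.0], [-3.0, 3.0]])


def toy_eps_model(x_t, alpha_bar, y):
    """Stand-in for a conditional noise predictor eps_theta(x_t, t, y).

    For a point-mass class at CLASS_MEANS[y], the exact noise that maps
    the class mean to x_t is recovered in closed form.
    """
    return (x_t - np.sqrt(alpha_bar) * CLASS_MEANS[y]) / np.sqrt(1.0 - alpha_bar)


def estimate_errors(x, eps_model, num_classes, alphas_bar, num_samples=32, seed=0):
    """Monte Carlo estimate of the per-class denoising error for input x."""
    rng = np.random.default_rng(seed)
    errors = np.zeros(num_classes)
    for _ in range(num_samples):
        # Share the same timestep and noise across classes to reduce variance.
        alpha_bar = alphas_bar[rng.integers(len(alphas_bar))]
        noise = rng.standard_normal(x.shape)
        x_t = np.sqrt(alpha_bar) * x + np.sqrt(1.0 - alpha_bar) * noise
        for y in range(num_classes):
            errors[y] += np.sum((eps_model(x_t, alpha_bar, y) - noise) ** 2)
    return errors / num_samples


def class_posterior(errors, log_prior=None):
    """Bayes' theorem: p(y | x) is proportional to p(x | y) p(y).

    Assumption: -errors[y] approximates log p(x | y) up to a shared constant.
    """
    errors = np.asarray(errors, dtype=float)
    if log_prior is None:
        log_prior = np.full(errors.shape, -np.log(errors.size))  # uniform prior
    logits = log_prior - errors
    logits = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()


alphas_bar = np.linspace(0.99, 0.01, 10)  # hypothetical noise schedule
x = CLASS_MEANS[1].copy()                 # an input drawn from class 1
errors = estimate_errors(x, toy_eps_model, len(CLASS_MEANS), alphas_bar)
probs = class_posterior(errors)
print("predicted class:", int(np.argmax(probs)))
```

In this toy setting the denoising error vanishes for the true class and is strictly positive for the others, so the posterior concentrates on the correct label. With a real diffusion model the errors are noisy estimates, and the number of sampled timesteps trades accuracy against compute.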

