Image classification and denoising suffer from complementary issues: classifiers lack robustness, while denoisers partially ignore conditioning information. We argue that both issues can be alleviated by unifying the two tasks through a model of the joint probability of (noisy) images and class labels. Classification is performed with a forward pass followed by conditioning. Using the Tweedie-Miyasawa formula, we express the denoising function in terms of the score, which can be computed by marginalization and back-propagation. The training objective is then a combination of a cross-entropy loss and a denoising score matching loss integrated over noise levels. Numerical experiments on CIFAR-10 and ImageNet show competitive classification and denoising performance compared to reference deep convolutional classifiers and denoisers, and significantly improved efficiency compared to previous joint approaches. Our model shows increased robustness to adversarial perturbations compared to a standard discriminative classifier, and allows for a novel interpretation of adversarial gradients as a difference of denoisers.
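The relations summarized above can be checked on a toy example. The sketch below is not the paper's network: it assumes a hypothetical one-dimensional, two-class Gaussian joint model (the names `PRIOR`, `MU`, `S2` and all helper functions are illustrative), and it uses a central finite difference as a stand-in for back-propagation. Under those assumptions it shows classification by conditioning the joint model, denoising through the Tweedie-Miyasawa formula D(y) = y + sigma^2 * d/dy log p(y), and the gradient of a class log-posterior written as a difference of denoisers, (D(y | c) - D(y)) / sigma^2.

```python
import math

# Hypothetical toy joint model (an assumption, not the paper's architecture):
# clean x | c ~ N(MU[c], S2), noisy y = x + sigma * z with z ~ N(0, 1),
# so that p(y, c) = PRIOR[c] * N(y; MU[c], S2 + sigma^2).
PRIOR = [0.3, 0.7]
MU = [-2.0, 1.5]
S2 = 0.5


def log_joint(y, sigma):
    """Return log p(y, c) for every class c at noise level sigma."""
    var = S2 + sigma ** 2
    return [
        math.log(p) - 0.5 * math.log(2 * math.pi * var) - (y - m) ** 2 / (2 * var)
        for p, m in zip(PRIOR, MU)
    ]


def logsumexp(values):
    """Numerically stable log of a sum of exponentials."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def log_marginal(y, sigma):
    """log p(y), obtained by marginalizing the joint over classes."""
    return logsumexp(log_joint(y, sigma))


def class_posterior(y, sigma):
    """p(c | y): condition the joint on y (a softmax over log p(y, c))."""
    logits = log_joint(y, sigma)
    norm = logsumexp(logits)
    return [math.exp(l - norm) for l in logits]


def derivative(f, y, eps=1e-5):
    """Central finite difference, standing in for back-propagation."""
    return (f(y + eps) - f(y - eps)) / (2 * eps)


def denoise(y, sigma, c=None):
    """Tweedie-Miyasawa denoiser: y + sigma^2 * score.

    With c=None the score of the marginal p(y) is used; otherwise the score
    of p(y, c), which equals the score of p(y | c) because p(c) does not
    depend on y, giving the class-conditional denoiser.
    """
    if c is None:
        log_density = lambda t: log_marginal(t, sigma)
    else:
        log_density = lambda t: log_joint(t, sigma)[c]
    return y + sigma ** 2 * derivative(log_density, y)
```

For instance, `class_posterior(0.4, 0.8)` returns the two class probabilities for a noisy observation, `denoise(0.4, 0.8)` the unconditional denoised estimate, and `denoise(0.4, 0.8, c=1)` the estimate conditioned on class 1. In a learned model the finite difference would be replaced by automatic differentiation of the network's log-joint output.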