EGC: Image Generation and Classification via a Single Energy-Based Model

Learning image classification and image generation with the same set of network parameters is a challenging problem. Recent advanced approaches that perform well in one task often exhibit poor performance in the other. This work introduces an energy-based classifier and generator, called EGC, which can achieve superior performance in both tasks using a single neural network. Unlike a conventional classifier, which outputs a label given an image (i.e., a conditional distribution $p(y|\mathbf{x})$), the forward pass in EGC is a classifier that outputs a joint distribution $p(\mathbf{x},y)$, enabling an image generator in its backward pass by marginalizing out the label $y$. Concretely, the forward pass estimates the energy and the classification probability of a noisy image, while the backward pass estimates the score function used to denoise it. EGC achieves competitive generation results compared with state-of-the-art approaches on ImageNet-1k, CelebA-HQ and LSUN Church, while achieving superior classification accuracy and robustness against adversarial attacks on CIFAR-10. This work represents the first successful attempt to simultaneously excel in both tasks using a single set of network parameters. We believe that EGC bridges the gap between discriminative and generative learning.
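
The link between the forward and backward passes can be illustrated with a minimal sketch. It assumes, as a simplification not spelled out in the abstract, that the per-class outputs of the network act as negative energies $-E(\mathbf{x},y)$, so that $p(\mathbf{x},y) \propto \exp(-E(\mathbf{x},y))$. A toy linear model stands in for the network, and the names `logits`, `forward`, and `score` are hypothetical. The noise conditioning and training objectives of EGC are omitted, so this is not the authors' implementation.

```python
import math

def logits(W, b, x):
    """Toy linear 'network': one output per class; output[y] plays the role of -E(x, y)."""
    return [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]

def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward(W, b, x):
    """Forward pass: class probabilities p(y|x) and the unnormalized log p(x)."""
    f = logits(W, b, x)
    log_px_unnorm = logsumexp(f)          # log sum_y exp(-E(x, y)) = log p(x) + log Z
    probs = [math.exp(v - log_px_unnorm) for v in f]  # softmax over classes = p(y|x)
    return probs, log_px_unnorm

def score(W, b, x):
    """Backward pass: gradient of log p(x) with respect to x.

    For the linear model this is analytic: sum_y p(y|x) * W[y].
    A real network would obtain it by automatic differentiation.
    """
    probs, _ = forward(W, b, x)
    return [sum(p * row[i] for p, row in zip(probs, W)) for i in range(len(x))]

if __name__ == "__main__":
    W = [[1.0, 0.0], [0.0, 1.0]]   # assumed toy weights: 2 classes, 2 input dimensions
    b = [0.0, 0.0]
    x = [0.3, -0.7]
    probs, log_px = forward(W, b, x)
    print("p(y|x):", probs)
    print("score :", score(W, b, x))
```

Marginalizing over $y$ inside `forward` is what makes the same outputs usable for generation: the normalizing constant $Z$ does not depend on $\mathbf{x}$, so it vanishes from the gradient computed in `score`. The linear model only illustrates the computation; it does not define a normalizable density.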
