Joint machine learning models that both synthesize and classify data often perform unevenly across the two tasks or are unstable to train. In this work, we start from a set of empirical observations indicating that the internal representations built by contemporary deep diffusion-based generative models are useful not only for generation but also for prediction. We then propose to extend the vanilla diffusion model with a classifier, which allows for stable joint end-to-end training with a parameterization shared between the two objectives. The resulting joint diffusion model outperforms recent state-of-the-art hybrid methods in both classification and generation quality on all evaluated benchmarks. Building on this joint training approach, we show how to benefit directly from the shared generative and discriminative representations by introducing a method for visual counterfactual explanations.
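To make the idea of a shared parameterization concrete, the following is a minimal sketch of how a joint objective could be assembled, not the implementation used in this work. It assumes a noise-prediction diffusion loss and a cross-entropy classification loss that both read features from one shared encoder. The names `encoder`, `linear`, `joint_loss`, the `params` layout, and the weight `alpha` are hypothetical. The one-layer encoder stands in for a real denoising network, and timestep conditioning is omitted for brevity.

```python
import math
import random


def encoder(x, w):
    """Shared representation: a single tanh layer (stand-in for a real encoder)."""
    return [math.tanh(sum(wi * xi for wi, xi in zip(row, x))) for row in w]


def linear(h, w):
    """Plain linear head without bias."""
    return [sum(wi * hi for wi, hi in zip(row, h)) for row in w]


def joint_loss(x0, label, params, alpha_bar, alpha=1.0, rng=random):
    """Return (total, diffusion_loss, classification_loss) for one example.

    alpha_bar is the cumulative noise-schedule coefficient in (0, 1);
    alpha is an assumed weight balancing the two objectives.
    """
    # Forward diffusion: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [math.sqrt(alpha_bar) * a + math.sqrt(1.0 - alpha_bar) * e
           for a, e in zip(x0, eps)]

    # Generative branch: predict the added noise from shared features of x_t.
    eps_hat = linear(encoder(x_t, params["enc"]), params["dec"])
    l_diff = sum((p - e) ** 2 for p, e in zip(eps_hat, eps)) / len(eps)

    # Discriminative branch: classify from the same encoder applied to x0.
    logits = linear(encoder(x0, params["enc"]), params["cls"])
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    l_cls = log_z - logits[label]  # cross-entropy for the true label

    return l_diff + alpha * l_cls, l_diff, l_cls
```

Because `params["enc"]` appears in both branches, a gradient step on the summed loss would update the shared encoder from both objectives at once, which is the sense in which the parameterization is shared in this sketch.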