Diffusion-based generative models (DBGMs) perturb data to a target noise distribution and reverse this process to generate samples. The choice of noising process, or inference diffusion process, affects both likelihoods and sample quality. For example, extending the inference process with auxiliary variables leads to improved sample quality. While there are many such multivariate diffusions to explore, each new one requires significant model-specific analysis, hindering rapid prototyping and evaluation. In this work, we study Multivariate Diffusion Models (MDMs). For any number of auxiliary variables, we provide a recipe for maximizing a lower bound on the MDM likelihood without requiring any model-specific analysis. We then demonstrate how to parameterize the diffusion for a specified target noise distribution; together, these two results enable optimizing the inference diffusion process. Optimizing the diffusion expands easy experimentation from just a few well-known processes to an automatic search over all linear diffusions. To demonstrate these ideas, we introduce two new specific diffusions and also learn a diffusion process on the MNIST, CIFAR10, and ImageNet32 datasets. We show that, for a given dataset and model architecture, learned MDMs match or surpass fixed choices of diffusion in bits per dimension (BPD).
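To make the idea of a linear multivariate diffusion with a specified target noise distribution concrete, here is a minimal sketch; it is not the recipe from this work, only an illustration under stated assumptions. It relies on a standard fact about linear SDEs: for dy = A y dt + G dW, choosing A = -(Q + D) with Q skew-symmetric and D symmetric positive semidefinite, and G Gᵀ = 2D, makes N(0, I) stationary because A + Aᵀ + G Gᵀ = 0. The names `Q`, `D`, `make_linear_diffusion`, and `simulate` are hypothetical, as are the particular matrices, which couple one data coordinate x to one auxiliary variable v.

```python
import numpy as np


def make_linear_diffusion(Q, D):
    """Build (A, G) for dy = A y dt + G dW with N(0, I) as stationary law.

    Assumption for this sketch: Q is skew-symmetric, D is symmetric PSD.
    """
    Q = np.asarray(Q, dtype=float)
    D = np.asarray(D, dtype=float)
    assert np.allclose(Q, -Q.T), "Q must be skew-symmetric"
    assert np.allclose(D, D.T), "D must be symmetric"
    eigvals, eigvecs = np.linalg.eigh(D)
    assert np.all(eigvals >= -1e-12), "D must be positive semidefinite"
    A = -(Q + D)
    # Symmetric square root of 2D, so that G @ G.T == 2 * D.
    root = np.sqrt(2.0 * np.clip(eigvals, 0.0, None))
    G = eigvecs @ np.diag(root) @ eigvecs.T
    return A, G


def simulate(y0, A, G, T=16.0, n_steps=1600, seed=0):
    """Euler-Maruyama simulation; y0 has shape (n_samples, dim)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    y = np.array(y0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(y.shape)
        # Row-vector form of y <- y + A y dt + sqrt(dt) G eps.
        y = y + (y @ A.T) * dt + np.sqrt(dt) * (noise @ G.T)
    return y


# Hypothetical example: y = (x, v), noise is injected only into v,
# and the skew-symmetric part Q mixes it into the data coordinate x.
Q = np.array([[0.0, -1.0], [1.0, 0.0]])
D = np.array([[0.0, 0.0], [0.0, 1.0]])
A, G = make_linear_diffusion(Q, D)

# Stationarity condition for covariance I: A + A^T + G G^T = 0.
lyapunov_residual = A + A.T + G @ G.T

# Start every sample at x = 3 (a stand-in "data point") with v ~ N(0, 1).
n = 20000
rng = np.random.default_rng(1)
y0 = np.column_stack([np.full(n, 3.0), rng.standard_normal(n)])
yT = simulate(y0, A, G)

mean_T = yT.mean(axis=0)
cov_T = np.cov(yT, rowvar=False)
print("residual:\n", lyapunov_residual)
print("mean at T:", mean_T)
print("cov at T:\n", cov_T)
```

Under these assumptions the empirical mean and covariance at the final time should be close to 0 and the identity, up to Monte Carlo and discretization error. Because the target noise distribution is fixed by construction, the entries of `Q` and `D` are free to vary, which is the sense in which one could search over linear diffusions.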