This paper concerns generative modeling through diffusion processes. One approach within this paradigm is the work of Song et al. (2021), which relies on a time-reversal argument to construct a diffusion process targeting the desired data distribution. We show that the time-reversal argument, common to all denoising diffusion probabilistic modeling proposals, is not necessary. Instead, we obtain diffusion processes targeting the desired data distribution by taking appropriate mixtures of diffusion bridges. The resulting transport is exact by construction, allows for greater flexibility in choosing the dynamics of the underlying diffusion, and can be approximated by a neural network via novel training objectives. We develop a unifying view of the drift adjustments corresponding to our approach and to time-reversal approaches, and use this representation to inspect the inner workings of diffusion-based generative models. Finally, we leverage scalable simulation and inference techniques common in spatial statistics to move beyond fully factorial distributions in the underlying diffusion dynamics. The methodological advances in this work contribute toward establishing a general framework for generative modeling based on diffusion processes.
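The idea of targeting a distribution with a mixture of diffusion bridges can be illustrated in a toy setting. The following is a minimal sketch, not the paper's algorithm: it assumes a one-dimensional Brownian reference diffusion with scale `sigma`, a fixed starting point `x0`, and a two-atom target, all of which are illustrative choices. In this special case the expected endpoint given the current state is available in closed form, so no neural network is needed; the function names and parameters are hypothetical.

```python
import numpy as np

def mixture_drift(x, t, x0, atoms, weights, sigma):
    """Drift of a Brownian-bridge mixture at states x (shape (n,)) and time t in [0, 1).

    Assumption: the target is discrete, with points `atoms` and probabilities `weights`.
    """
    if t <= 0.0:
        # At t = 0 every path sits at x0, so the posterior over endpoints is the prior.
        post = np.broadcast_to(weights, (x.shape[0], atoms.shape[0]))
    else:
        # Under a bridge from x0 to y, X_t ~ N(x0 + t (y - x0), sigma^2 t (1 - t)).
        mean = x0 + t * (atoms - x0)
        var = sigma**2 * t * (1.0 - t)
        logp = np.log(weights) - 0.5 * (x[:, None] - mean[None, :]) ** 2 / var
        logp -= logp.max(axis=1, keepdims=True)  # stabilise before exponentiating
        post = np.exp(logp)
        post /= post.sum(axis=1, keepdims=True)
    expected_endpoint = post @ atoms
    # A single bridge to y has drift (y - x) / (1 - t); the mixture averages over y.
    return (expected_endpoint - x) / (1.0 - t)

def sample_bridge_mixture(n_paths=20000, n_steps=200, x0=0.0,
                          atoms=(-1.0, 2.0), weights=(0.3, 0.7),
                          sigma=1.0, seed=0):
    """Euler-Maruyama simulation of the mixture process on [0, 1]."""
    atoms = np.asarray(atoms, dtype=float)
    weights = np.asarray(weights, dtype=float)
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    x = np.full(n_paths, float(x0))
    for i in range(n_steps):
        t = i * dt
        drift = mixture_drift(x, t, x0, atoms, weights, sigma)
        x = x + drift * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
    return x

if __name__ == "__main__":
    samples = sample_bridge_mixture()
    # Fraction of paths ending near the atom at 2.0; should be close to its weight.
    print(float(np.mean(samples > 0.5)))
```

Because the simulation runs forward in time from a fixed point, no time-reversal step appears anywhere in the sketch. The terminal samples carry a small amount of discretisation noise from the final Euler step, so they cluster around the atoms rather than landing on them exactly.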