We introduce a class of generative models based on the stochastic interpolant framework proposed in Albergo & Vanden-Eijnden (2023) that unifies flow-based and diffusion-based methods. We first show how to construct a broad class of continuous-time stochastic processes whose time-dependent probability density function bridges two arbitrary densities exactly in finite time. These "stochastic interpolants" are built by combining data from the two densities with an additional latent variable, and the specific details of the construction can be leveraged to shape the resulting time-dependent density in a flexible way. We then show that the time-dependent density of the stochastic interpolant satisfies a first-order transport equation as well as a family of forward and backward Fokker-Planck equations with tunable diffusion; upon consideration of the time evolution of an individual sample, this viewpoint immediately leads to both deterministic and stochastic generative models based on probability flow equations or stochastic differential equations with a tunable level of noise. The drift coefficients entering these models are time-dependent velocity fields characterized as the unique minimizers of simple quadratic objective functions, one of which is a new objective for the score of the interpolant density. Remarkably, we show that minimization of these quadratic objectives leads to control of the likelihood for generative models built upon stochastic dynamics; by contrast, we show that generative models based upon deterministic dynamics must, in addition, control the Fisher divergence between the target and the model. Finally, we construct estimators for the likelihood and the cross-entropy of interpolant-based generative models, and demonstrate that such models recover the Schrödinger bridge between the two target densities when explicitly optimizing over the interpolant.
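As a rough illustration of the construction described above, the following is a minimal sketch of one possible stochastic interpolant: a sample from each of the two densities is combined with a latent Gaussian variable so that the process matches the first density at t = 0 and the second at t = 1. The linear form, the noise schedule gamma(t) = sqrt(2t(1 - t)), the function name `interpolant`, and the two Gaussian stand-in densities are all assumptions chosen for illustration; the abstract itself does not fix any of them.

```python
import numpy as np

def interpolant(t, x0, x1, z):
    """Illustrative interpolant: x_t = (1 - t) * x0 + t * x1 + gamma(t) * z.

    gamma(t) = sqrt(2 t (1 - t)) is an assumed schedule that vanishes at
    t = 0 and t = 1, so the endpoints reproduce x0 and x1 exactly.
    """
    gamma = np.sqrt(2.0 * t * (1.0 - t))
    return (1.0 - t) * x0 + t * x1 + gamma * z

rng = np.random.default_rng(0)
n = 10_000
x0 = rng.normal(loc=-2.0, scale=0.5, size=n)  # stand-in samples from the first density
x1 = rng.normal(loc=3.0, scale=1.0, size=n)   # stand-in samples from the second density
z = rng.normal(size=n)                        # latent variable, independent of x0 and x1

x_start = interpolant(0.0, x0, x1, z)  # equals x0: latent term is switched off
x_mid = interpolant(0.5, x0, x1, z)    # mixture of both endpoints plus latent noise
x_end = interpolant(1.0, x0, x1, z)    # equals x1: latent term is switched off
```

In this sketch the latent term only contributes at intermediate times, which is one way the time-dependent density can be shaped between fixed endpoints; other interpolation functions and schedules satisfying the same boundary conditions would serve equally well.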