Recent successful generative models are trained by fitting a neural network to a predefined, tractable probability density path that transports noise to training examples. In this paper we investigate the space of Gaussian probability paths, which includes diffusion paths as a special case, and seek a member that is optimal in a useful sense. In particular, minimizing the Kinetic Energy (KE) of a path is known to make particles' trajectories simple, and hence easier to sample, and to empirically improve performance in terms of the likelihood of unseen data and sample generation quality. We investigate Kinetic Optimal (KO) Gaussian paths and offer the following observations: (i) We show that the KE takes a simplified form on the space of Gaussian paths, where the data enters only through a single one-dimensional scalar function, called the \emph{data separation function}. (ii) We characterize the KO solutions with a one-dimensional ODE. (iii) We approximate data-dependent KO paths by approximating the data separation function and minimizing the KE. (iv) We prove that the data separation function converges to $1$ for an arbitrary normalized dataset of $n$ samples in $d$ dimensions as $n/\sqrt{d}\rightarrow 0$. A consequence of this result is that the Conditional Optimal Transport (Cond-OT) path becomes \emph{kinetic optimal} as $n/\sqrt{d}\rightarrow 0$. We further support this theory with empirical experiments on ImageNet.
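To make the notion of a Gaussian path that takes noise to a training example concrete, here is a minimal sketch of a single conditional trajectory under the Cond-OT schedule. The abstract does not spell out a parameterization, so the form $x_t = \alpha_t x_1 + \sigma_t x_0$ with $\alpha_t = t$, $\sigma_t = 1 - t$ is an assumption borrowed from the flow matching literature, and all function names are hypothetical. The sketch only illustrates that such trajectories are straight lines with constant velocity; it does not compute the KE of the marginal path or the data separation function.

```python
import random

def cond_ot_schedule(t):
    """Assumed Cond-OT schedule: alpha_t = t, sigma_t = 1 - t."""
    return t, 1.0 - t

def sample_path_point(x1, x0, t):
    """Point x_t = alpha_t * x1 + sigma_t * x0 on the conditional trajectory."""
    alpha, sigma = cond_ot_schedule(t)
    return [alpha * a + sigma * b for a, b in zip(x1, x0)]

def conditional_velocity(x1, x0):
    """Time derivative of x_t under this schedule: d/dt (t*x1 + (1-t)*x0) = x1 - x0."""
    return [a - b for a, b in zip(x1, x0)]

rng = random.Random(0)
d = 4
x1 = [rng.uniform(-1, 1) for _ in range(d)]  # stand-in for a training example
x0 = [rng.gauss(0, 1) for _ in range(d)]     # standard Gaussian noise sample

start = sample_path_point(x1, x0, 0.0)  # coincides with the noise sample x0
end = sample_path_point(x1, x0, 1.0)    # coincides with the data point x1
velocity = conditional_velocity(x1, x0)  # independent of t: a straight line
```

Because the velocity does not depend on $t$, the particle moves from `x0` to `x1` along a straight segment at constant speed, which is the sense in which such trajectories are simple to integrate.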