Diffusion models (DMs) are state-of-the-art generative models for continuous inputs. DMs construct a Stochastic Differential Equation (SDE) in the input space (i.e., position space) and use a neural network to reverse it. In this work, we introduce a novel generative modeling framework grounded in \textbf{phase space dynamics}, where a phase space is defined as an augmented space encompassing both position and velocity. Leveraging insights from Stochastic Optimal Control, we construct a path measure in the phase space that enables efficient sampling. In contrast to DMs, our framework can generate realistic data points at an early stage of dynamics propagation. This early prediction sets the stage for efficient data generation by leveraging additional velocity information along the trajectory. On standard image generation benchmarks, our model yields favorable performance over baselines when the Number of Function Evaluations (NFEs) is small. Furthermore, our approach rivals the performance of diffusion models equipped with efficient sampling techniques, underscoring its potential as a new tool for generative modeling.
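To make the notion of a phase space concrete, the following is a minimal sketch and not the method proposed in this work. It assumes a generic position-velocity system of the form dx = v dt, dv = a(x, v, t) dt + g dW, integrated with a plain Euler-Maruyama scheme. The names `simulate_phase_space`, `extrapolate_endpoint`, and `toy_accel`, the constant noise scale `g`, and the damped-spring acceleration are all hypothetical choices made for illustration. The straight-line endpoint guess is likewise only an assumed, simplified stand-in for how velocity information might support an early prediction.

```python
import numpy as np


def simulate_phase_space(x0, v0, accel, g=0.5, n_steps=100, T=1.0, seed=0):
    """Euler-Maruyama integration of dx = v dt, dv = accel(x, v, t) dt + g dW.

    `accel` is a hypothetical user-supplied function; in a learned model it
    would be replaced by a neural network (an assumption, not shown here).
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.array(x0, dtype=float)
    v = np.array(v0, dtype=float)
    for k in range(n_steps):
        t = k * dt
        a = accel(x, v, t)
        # Position advances with the current velocity (no noise on x).
        x = x + v * dt
        # Velocity receives the drift plus Gaussian noise scaled by sqrt(dt).
        v = v + a * dt + g * np.sqrt(dt) * rng.standard_normal(v.shape)
    return x, v


def extrapolate_endpoint(x, v, t, T=1.0):
    """Straight-line guess of the position at time T from the state at time t."""
    return x + (T - t) * v


def toy_accel(x, v, t, target=np.array([2.0, -1.0])):
    """Toy damped spring pulling the position toward a fixed target (illustrative only)."""
    return 20.0 * (target - x) - 8.0 * v


if __name__ == "__main__":
    x_end, v_end = simulate_phase_space(np.zeros(2), np.zeros(2), toy_accel)
    print("final position:", x_end)
    print("final velocity:", v_end)
    # Endpoint guess made halfway through, from an arbitrary example state.
    print("guess at t=0.5:", extrapolate_endpoint(np.array([1.0, -0.5]), np.array([2.0, -1.0]), 0.5))
```

The point of the sketch is only that the state carries a velocity in addition to a position, so a sampler has directional information available at every step. A position-space SDE, by contrast, exposes only the current position.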