We introduce Approximated Optimal Transport (AOT), a novel training scheme for diffusion-based generative models. AOT approximates optimal transport and integrates it into the training process, which significantly improves the accuracy with which diffusion models estimate the denoiser outputs. As a result, the ODE trajectories of the trained models have lower curvature, and sampling incurs smaller truncation errors. Training with AOT yields better image quality with fewer sampling steps: we achieve Fréchet Inception Distance (FID) scores of 1.88 with just 27 function evaluations (NFEs) for unconditional generation and 1.73 with 29 NFEs for conditional generation. Furthermore, when AOT is also applied to train the discriminator for guidance, we establish new state-of-the-art FID scores of 1.68 and 1.58 for unconditional and conditional generation, respectively, each with 29 NFEs. These results demonstrate the effectiveness of AOT in enhancing the performance of diffusion models.
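To make the idea of approximating optimal transport during training concrete, the following is a minimal sketch of one common way to couple images and noise within a minibatch. It is not the procedure of this paper, whose details the abstract does not give. The function name `greedy_noise_pairing`, the squared Euclidean cost, and the greedy assignment rule are all assumptions chosen for illustration. The sketch gives each image a distinct noise sample that is close to it, instead of pairing images and noise at random.

```python
import numpy as np


def greedy_noise_pairing(images, noises):
    """Greedily assign each image a distinct, nearby noise sample.

    Hypothetical helper, not the paper's algorithm.
    images, noises: arrays of shape (B, D), flattened per sample.
    Returns an integer array `perm` such that noises[perm[i]] is
    paired with images[i].
    """
    # Pairwise squared Euclidean distances, shape (B, B) (assumed cost).
    diff = images[:, None, :] - noises[None, :, :]
    cost = (diff ** 2).sum(axis=-1)

    perm = np.empty(len(images), dtype=int)
    available = np.ones(len(noises), dtype=bool)
    for i in range(len(images)):
        # Exclude noise samples that are already assigned to another image.
        masked = np.where(available, cost[i], np.inf)
        j = int(np.argmin(masked))
        perm[i] = j
        available[j] = False
    return perm


# Usage: build noisy training inputs from the coupled pairs.
rng = np.random.default_rng(0)
images = rng.normal(size=(8, 16))   # stand-in for a flattened image batch
noises = rng.normal(size=(8, 16))   # Gaussian noise candidates
sigma = 0.5                         # illustrative noise level
perm = greedy_noise_pairing(images, noises)
noisy_inputs = images + sigma * noises[perm]
```

A greedy assignment only approximates the optimal coupling. An exact minibatch assignment (for example, with the Hungarian algorithm) would give a lower total cost at a higher computational price.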