In this work, we examine score-based generative models (also called diffusion generative models) from a geometric perspective. From this new viewpoint, we prove that both the forward process of adding noise and the backward process of generating from noise are Wasserstein gradient flows in the space of probability measures. We are the first to prove this connection. Our understanding of score-based (and diffusion) generative models has matured and become more complete by drawing on ideas from different fields such as Bayesian inference, control theory, stochastic differential equations, and Schrödinger bridges. However, many open questions and challenges remain; one example is how to decrease the sampling time. We demonstrate that the geometric perspective enables us to answer many of these questions and to provide new interpretations of some known results. Furthermore, the geometric perspective enables us to devise an intuitive geometric solution to the problem of faster sampling. By augmenting traditional score-based generative models with a projection step, we show that we can generate high-quality images with significantly fewer sampling steps.