Learning the distribution of data on Riemannian manifolds is crucial for modeling data from non-Euclidean spaces, which is required by many applications in diverse scientific fields. Yet existing generative models on manifolds either suffer from expensive divergence computation or rely on approximations of the heat kernel. These limitations restrict their applicability to simple geometries and hinder scalability to high dimensions. In this work, we introduce the Riemannian Diffusion Mixture, a principled framework for building a generative process on manifolds as a mixture of endpoint-conditioned diffusion processes, instead of relying on the denoising approach of previous diffusion models. The resulting generative process is characterized by a drift that guides it toward the most probable endpoint with respect to the geometry of the manifold. We further propose a simple yet efficient training objective for learning the mixture process, which is readily applicable to general manifolds. Our method outperforms previous generative models on various manifolds, scales to high dimensions, and requires dramatically fewer in-training simulation steps on general manifolds.
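To make the idea of an endpoint-conditioned diffusion process concrete, the following is a minimal sketch on the unit sphere S². It is not the method of this work: the drift form (the logarithm map toward the endpoint, scaled by 1/(T - t)), the geodesic random walk discretization, and all names (`sphere_log`, `sphere_exp`, `simulate_bridge`, `sigma`, `n_steps`) are assumptions chosen for illustration only.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(u):
    n = math.sqrt(dot(u, u))
    return [a / n for a in u]


def sphere_log(x, z):
    """Log map on the unit sphere: tangent vector at x pointing toward z,
    with length equal to the geodesic distance between x and z."""
    c = max(-1.0, min(1.0, dot(x, z)))
    theta = math.acos(c)
    w = [zi - c * xi for xi, zi in zip(x, z)]  # part of z orthogonal to x
    n = math.sqrt(dot(w, w))
    if n < 1e-12:
        # x == z, or x and z are antipodal (log map not unique there).
        return [0.0, 0.0, 0.0]
    return [theta * wi / n for wi in w]


def sphere_exp(x, v):
    """Exponential map on the unit sphere: follow the geodesic from x along v."""
    n = math.sqrt(dot(v, v))
    if n < 1e-12:
        return list(x)
    y = [math.cos(n) * xi + math.sin(n) * vi / n for xi, vi in zip(x, v)]
    return normalize(y)  # guard against floating-point drift off the sphere


def tangent_noise(x, rng):
    """Standard Gaussian in R^3 projected onto the tangent space at x."""
    g = [rng.gauss(0.0, 1.0) for _ in range(3)]
    c = dot(g, x)
    return [gi - c * xi for gi, xi in zip(g, x)]


def simulate_bridge(x0, z, sigma=1.0, T=1.0, n_steps=1000, seed=0):
    """Simulate a process from x0 that is pulled toward the endpoint z.

    Assumed drift (illustrative): log_x(z) / (T - t).
    Discretization: geodesic random walk (tangent step, then exp map).
    """
    rng = random.Random(seed)
    dt = T / n_steps
    x, z = normalize(x0), normalize(z)
    for k in range(n_steps):
        t = k * dt
        drift = [li / (T - t) for li in sphere_log(x, z)]
        noise = tangent_noise(x, rng)
        step = [d * dt + sigma * math.sqrt(dt) * e for d, e in zip(drift, noise)]
        x = sphere_exp(x, step)
    return x
```

Calling `simulate_bridge([0.0, 0.0, 1.0], [1.0, 0.0, 0.0])` starts at the north pole and returns a point on the sphere close to the chosen endpoint. The drift grows as t approaches T, which is what pins the trajectory to the endpoint. A mixture process in the sense described above would combine such bridges over endpoints drawn from the data distribution; that step is not shown here.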