Sampling from unnormalized densities is analogous to the generative modeling problem, but the target distribution is defined by a known energy function instead of data samples. Because evaluating the energy function is often costly, a primary challenge is to learn an efficient sampler. We introduce Flow Sampling, a framework built on diffusion models and flow matching for the data-free setting. Our training objective is conditioned on a noise sample and regresses onto a denoising diffusion drift constructed from the energy function. In contrast, the objective of a standard diffusion model is conditioned on a data sample and regresses onto a noising diffusion drift. We use the interpolant process to minimize the number of energy function evaluations during training, resulting in an efficient and scalable method for sampling from unnormalized densities. Furthermore, our formulation extends naturally to Riemannian manifolds, enabling diffusion-based sampling in geometries beyond Euclidean space. We derive a closed-form expression for the conditional drift on constant-curvature manifolds, including hyperspheres and hyperbolic spaces. We evaluate Flow Sampling on synthetic energy benchmarks, small peptides, large-scale amortized molecular conformer generation, and distributions supported on the sphere, demonstrating strong empirical performance.
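To make the problem setting concrete, the following minimal sketch shows a target that is specified only through an energy function. It is not the Flow Sampling method. The double-well energy, the function names, and the Boltzmann-type convention that the density is proportional to exp(-E(x)) are all assumptions chosen for illustration and do not come from the text above.

```python
import math


def energy(x: float) -> float:
    """Hypothetical 1-D double-well energy (illustrative assumption)."""
    return (x * x - 1.0) ** 2


def unnormalized_density(x: float) -> float:
    """Assumed convention: density proportional to exp(-E(x)); Z is not known."""
    return math.exp(-energy(x))


def estimate_normalizer(lo: float = -4.0, hi: float = 4.0, n: int = 2001) -> float:
    """Trapezoidal estimate of the normalizing constant Z on [lo, hi].

    Each grid point costs one energy evaluation, so this brute-force
    approach is only practical in very low dimension.
    """
    h = (hi - lo) / (n - 1)
    values = [unnormalized_density(lo + i * h) for i in range(n)]
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


Z = estimate_normalizer()
```

In one dimension the normalizer can be found by quadrature, but the number of grid points, and hence energy evaluations, grows exponentially with dimension. This is one way to see why a learned sampler that uses few energy evaluations is desirable.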