Efficiently generating statistically independent samples from an unnormalized probability distribution, such as equilibrium samples of many-body systems, is a foundational problem in science. In this paper, we propose Iterated Denoising Energy Matching (iDEM), an iterative algorithm that trains a diffusion-based sampler with a novel stochastic score matching objective that uses only the energy function and its gradient, and no data samples. Specifically, iDEM alternates between (I) sampling regions of high model density from a diffusion-based sampler and (II) using these samples in our stochastic matching objective to further improve the sampler. iDEM is scalable to high dimensions because the inner matching objective is simulation-free and requires no MCMC samples. Moreover, by leveraging the fast mode mixing behavior of diffusion, iDEM smooths out the energy landscape, enabling efficient exploration and learning of an amortized sampler. We evaluate iDEM on a suite of tasks ranging from standard synthetic energy functions to invariant $n$-body particle systems. We show that the proposed approach achieves state-of-the-art performance on all metrics and trains $2$ to $5\times$ faster, which allows it to be the first method to train using energy on the challenging $55$-particle Lennard-Jones system.
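To make concrete how a score-matching target can be built from the energy and its gradient alone, the following is a minimal sketch and not necessarily the exact objective used by iDEM. It relies on a general identity: if $p_0(x) \propto \exp(-E(x))$ is convolved with Gaussian noise of scale $\sigma_t$, the score of the noised density at $x_t$ equals a self-normalized, importance-weighted average of $-\nabla E$ over samples drawn from $\mathcal{N}(x_t, \sigma_t^2 I)$. The toy Gaussian energy and the names `energy`, `grad_energy`, and `mc_score_estimate` are assumptions introduced only for illustration.

```python
import numpy as np

def energy(x):
    # Hypothetical toy energy (assumption): standard Gaussian, E(x) = ||x||^2 / 2.
    return 0.5 * np.sum(x**2, axis=-1)

def grad_energy(x):
    # Gradient of the toy energy above.
    return x

def mc_score_estimate(x_t, sigma_t, num_samples, rng):
    """Monte Carlo estimate of the score of the noised density at x_t.

    Uses only energy() and grad_energy(); no samples from the target are needed.
    """
    # Draw candidate clean points around the noisy point x_t.
    noise = rng.standard_normal((num_samples, x_t.shape[-1]))
    x0 = x_t + sigma_t * noise
    # Self-normalized weights proportional to exp(-E(x0)), computed stably.
    log_w = -energy(x0)
    log_w = log_w - log_w.max()
    w = np.exp(log_w)
    w = w / w.sum()
    # Weighted average of -grad E gives the score estimate.
    return -(w[:, None] * grad_energy(x0)).sum(axis=0)

rng = np.random.default_rng(0)
x_t = np.array([1.0, -2.0])
sigma_t = 0.5
estimate = mc_score_estimate(x_t, sigma_t, 200_000, rng)
# For this toy energy the noised density is N(0, (1 + sigma_t^2) I),
# so the exact score is -x_t / (1 + sigma_t^2).
exact = -x_t / (1.0 + sigma_t**2)
print(estimate, exact)
```

In a loop of the kind described above, an estimate like this could serve as the regression target for a score network in step (II), evaluated at noised versions of the points produced by the sampler in step (I).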