Diffusion models can be parameterized in terms of either a score function or an energy function. The energy parameterization is attractive because it enables sampling procedures, such as Markov chain Monte Carlo (MCMC), that incorporate a Metropolis--Hastings (MH) correction step based on energy differences between proposed samples. Such corrections can significantly improve sampling quality, particularly in model composition, where pre-trained models are combined to generate samples from novel distributions. Score-based diffusion models, on the other hand, are more widely adopted and come with a rich ecosystem of pre-trained models. However, they do not, in general, define an underlying energy function, which makes MH-based sampling inapplicable. In this work, we address this limitation by retaining the score parameterization and introducing a novel MH-like acceptance rule based on line integration of the score function. This allows existing diffusion models to be reused while the reverse process, viewed as an instance of annealed MCMC, is combined with various MCMC techniques. Through experiments on synthetic and real-world data, we show that our MH-like samplers yield relative improvements of similar magnitude to those observed with energy-based models, without requiring an explicit energy parameterization.
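To make the idea of an acceptance rule based on line integration of the score concrete, the sketch below estimates the log-density difference between two points as $\log p(y) - \log p(x) = \int_0^1 s(x + t(y - x))^\top (y - x)\,dt$ and uses it inside a Langevin-type MH step. This is an illustrative sketch only, not the paper's implementation: the straight-line path, the trapezoidal quadrature, the MALA proposal, the toy standard-Gaussian score, and all function names (`toy_score`, `log_ratio_line_integral`, `mh_like_langevin_step`) are assumptions introduced here. For a learned score that is not exactly a gradient field, the integral can depend on the chosen path, which is presumably why such a rule is only "MH-like".

```python
import numpy as np


def toy_score(x):
    """Assumed toy target: standard Gaussian, whose score is -x."""
    return -x


def log_ratio_line_integral(score_fn, x, y, n_points=16):
    """Estimate log p(y) - log p(x) by integrating the score along the
    straight segment from x to y with the trapezoidal rule (assumed path
    and quadrature; neither is specified by the abstract)."""
    direction = y - x
    ts = np.linspace(0.0, 1.0, n_points)
    # Integrand: score at each point on the segment, projected on the direction.
    integrand = np.array([score_fn(x + t * direction) @ direction for t in ts])
    dt = ts[1] - ts[0]
    return dt * (0.5 * integrand[0] + integrand[1:-1].sum() + 0.5 * integrand[-1])


def mh_like_langevin_step(score_fn, x, step, rng):
    """One Langevin proposal followed by an accept/reject test in which the
    energy difference is replaced by the score line integral."""
    mean_fwd = x + step * score_fn(x)
    y = mean_fwd + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
    mean_rev = y + step * score_fn(y)
    # Gaussian proposal log-densities, up to a shared constant.
    log_q_fwd = -np.sum((y - mean_fwd) ** 2) / (4.0 * step)
    log_q_rev = -np.sum((x - mean_rev) ** 2) / (4.0 * step)
    log_alpha = log_ratio_line_integral(score_fn, x, y) + log_q_rev - log_q_fwd
    if np.log(rng.uniform()) < log_alpha:
        return y, True
    return x, False


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.array([3.0, -3.0])
    accepted = 0
    for _ in range(1000):
        x, ok = mh_like_langevin_step(toy_score, x, step=0.1, rng=rng)
        accepted += ok
    print("final sample:", x, "acceptance rate:", accepted / 1000)
```

In an annealed setting, one would presumably call such a step with the score network evaluated at the current noise level; that coupling is omitted here to keep the sketch self-contained.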