In recent years, several score-based diffusion approaches have been proposed for sampling from probability distributions without access to exact samples, relying solely on evaluations of an unnormalized density. The resulting samplers approximate the time-reversal of a noising diffusion process that bridges the target distribution to an easy-to-sample base distribution. In practice, the performance of these methods depends heavily on key hyperparameters whose accurate tuning requires ground-truth samples. Our work highlights and addresses this fundamental issue, focusing in particular on multi-modal distributions, which pose significant challenges for existing sampling methods. Building on existing approaches, we introduce the Learned Reference-based Diffusion Sampler (LRDS), a methodology specifically designed to leverage prior knowledge of the location of the target modes in order to bypass the obstacle of hyperparameter tuning. LRDS proceeds in two steps: (i) learning a reference diffusion model, tailored for multimodality, on samples located in high-density regions of the space, and (ii) using this reference model to foster the training of a diffusion-based sampler. We demonstrate experimentally that, on a variety of challenging distributions, LRDS exploits prior knowledge of the target distribution better than competing algorithms.
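To make the notion of a reference diffusion concrete, the following is a minimal one-dimensional sketch, not the LRDS algorithm itself. It assumes the prior knowledge takes the form of approximate mode locations, and it swaps in a Gaussian mixture placed at those locations as the reference, because the score of a noised Gaussian mixture is available in closed form. The mixture parameters, the Ornstein-Uhlenbeck noising process, and all function names are illustrative assumptions; step (ii), training a sampler against the unnormalized target density, is not shown.

```python
import math
import random

# Hypothetical 1D reference: a two-component Gaussian mixture placed at
# assumed mode locations (illustrative values, not taken from the paper).
WEIGHTS = [0.5, 0.5]
MEANS = [-4.0, 4.0]
STDS = [0.5, 0.5]


def noised_params(t):
    """Component means and variances after noising for time t with the
    Ornstein-Uhlenbeck process dX = -X dt + sqrt(2) dW."""
    a = math.exp(-t)
    means = [a * m for m in MEANS]
    variances = [a * a * s * s + (1.0 - a * a) for s in STDS]
    return means, variances


def reference_score(x, t):
    """Closed-form score d/dx log p_t(x) of the noised mixture."""
    means, variances = noised_params(t)
    log_terms = [
        math.log(w) - 0.5 * math.log(2.0 * math.pi * v) - 0.5 * (x - m) ** 2 / v
        for w, m, v in zip(WEIGHTS, means, variances)
    ]
    top = max(log_terms)  # subtract the max for numerical stability
    resp = [math.exp(lt - top) for lt in log_terms]
    total = sum(resp)
    # Responsibility-weighted average of the per-component Gaussian scores.
    return sum(r / total * (-(x - m) / v) for r, m, v in zip(resp, means, variances))


def sample_reverse(n_steps=200, horizon=4.0, rng=random):
    """Euler-Maruyama discretisation of the time-reversed diffusion,
    started from the base distribution N(0, 1)."""
    dt = horizon / n_steps
    x = rng.gauss(0.0, 1.0)
    for i in range(n_steps):
        t = horizon - i * dt
        # Reverse-time drift for f(x) = -x and g^2 = 2: x + 2 * score.
        drift = x + 2.0 * reference_score(x, t)
        x += drift * dt + math.sqrt(2.0 * dt) * rng.gauss(0.0, 1.0)
    return x


if __name__ == "__main__":
    random.seed(0)
    samples = [sample_reverse() for _ in range(10)]
    print([round(s, 2) for s in samples])
```

Because the reference score is exact here, the reversed process carries base samples to both assumed modes without any learned component. In a setting like the one the abstract describes, a reference of this kind would serve as a starting point for the sampler rather than as the final sampler.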