Efficient sampling from unnormalized target distributions is pivotal in scientific computing and machine learning. While neural samplers have shown promise, particularly in sampling efficiency, existing neural implicit samplers still suffer from poor mode coverage, unstable training dynamics, and suboptimal performance. To address these issues, we introduce Denoising Fisher Training (DFT), a novel training approach for neural implicit samplers with theoretical guarantees. We frame training as minimization of the Fisher divergence and derive a tractable yet equivalent loss function, a distinct theoretical contribution to estimating otherwise intractable Fisher divergences. DFT is empirically validated across diverse sampling benchmarks, including two-dimensional synthetic distributions, Bayesian logistic regression, and high-dimensional energy-based models (EBMs). Notably, in experiments with high-dimensional EBMs, our best one-step DFT neural sampler matches the results of MCMC methods that use up to 200 sampling steps, making it more than 100 times more efficient. This result not only demonstrates the superior performance of DFT on complex high-dimensional sampling tasks but also sheds light on efficient sampling methodologies for broader applications.
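For context, the Fisher divergence between the sampler's distribution \(q_\theta\) and the target \(p\) is commonly defined as below; this is the standard definition, not the paper's derived tractable surrogate:

\[
\mathcal{D}_{F}(q_\theta \,\|\, p) \;=\; \mathbb{E}_{x \sim q_\theta}\!\left[ \big\| \nabla_x \log q_\theta(x) - \nabla_x \log p(x) \big\|_2^2 \right].
\]

The target score \(\nabla_x \log p(x)\) is available even when \(p\) is unnormalized, since the normalizing constant vanishes under the gradient; the difficulty is that an implicit sampler provides no access to its own score \(\nabla_x \log q_\theta(x)\), which is why an equivalent tractable loss is needed.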