Stop the Sampler! Classifier-Based Adaptive Stopping for Sampling Kernels

Sampling from complex, unnormalized probability densities is a fundamental challenge in Bayesian inference and probabilistic modeling. While Markov chain Monte Carlo (MCMC) methods provide asymptotic guarantees, they often suffer from slow mixing and high computational costs due to fixed or manually tuned trajectory lengths. In this work, we propose a novel framework that treats trajectory termination as a learnable component of the sampling dynamics. By framing MCMC within the theory of non-acyclic generative flow networks (GFlowNets), we train state-dependent neural classifiers to decide when a trajectory has reached a high-density region and should terminate. We theoretically establish the connection between optimal classifiers and the target density via detailed balance conditions and introduce a multilevel training scheme to facilitate exploration in complex geometries. Experimental results across various benchmark densities demonstrate that our approach significantly reduces average trajectory lengths while improving mode coverage and mixing compared to standard MCMC baselines.
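To make the idea of a state-dependent stopping decision concrete, here is a minimal toy sketch. It is not the paper's algorithm: the stopping rule `stop_probability` is a hand-written sigmoid of the log-density standing in for a trained neural classifier, and the target `log_density`, the random-walk Metropolis kernel, and all parameter values (`threshold`, `sharpness`, `step_size`, `max_steps`) are assumptions chosen for illustration only. No GFlowNet training or multilevel scheme is implemented here.

```python
import math
import random


def log_density(x):
    """Unnormalized log-density of a toy 1D mixture with modes at -3 and +3 (assumed target)."""
    a = -0.5 * (x - 3.0) ** 2
    b = -0.5 * (x + 3.0) ** 2
    m = max(a, b)  # log-sum-exp for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def stop_probability(x, threshold=-1.0, sharpness=4.0):
    """Hypothetical stand-in for a trained classifier: sigmoid of (log-density - threshold)."""
    z = sharpness * (log_density(x) - threshold)
    return 0.5 * (1.0 + math.tanh(0.5 * z))  # equals 1 / (1 + exp(-z)), overflow-safe


def metropolis_step(x, rng, step_size=1.0):
    """One random-walk Metropolis step targeting exp(log_density)."""
    proposal = x + rng.gauss(0.0, step_size)
    accept_prob = math.exp(min(0.0, log_density(proposal) - log_density(x)))
    return proposal if rng.random() < accept_prob else x


def sample_with_stopping(x0, rng, max_steps=1000):
    """Run the kernel until the stopping rule fires; return (final state, steps taken)."""
    x = x0
    for t in range(max_steps):
        if rng.random() < stop_probability(x):
            return x, t
        x = metropolis_step(x, rng)
    return x, max_steps  # hard cap so every trajectory terminates


if __name__ == "__main__":
    rng = random.Random(0)
    results = [sample_with_stopping(0.0, rng) for _ in range(200)]
    mean_len = sum(t for _, t in results) / len(results)
    print(f"mean trajectory length: {mean_len:.1f}")
```

In this sketch a trajectory started in the low-density valley at `x0 = 0.0` keeps moving until it reaches a region where the stand-in rule assigns a high stopping probability, so trajectory length varies with the state rather than being fixed in advance. A hand-written rule like this does not by itself make the stopped samples follow the target density; tying the stopping rule to the target is the role the abstract assigns to the trained classifiers and the detailed balance conditions.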

