Parallel tempering is a meta-algorithm for Markov chain Monte Carlo that runs multiple chains sampling tempered versions of the target distribution, improving mixing on multi-modal distributions that are challenging for traditional methods. The effectiveness of parallel tempering depends heavily on the choice of chain temperatures. Here, we present an adaptive temperature selection algorithm that dynamically adjusts temperatures during sampling using a policy gradient approach. Experiments demonstrate that our method achieves lower integrated autocorrelation times than traditional geometrically spaced temperatures and uniform-acceptance-rate schemes on benchmark distributions.
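To make the setup concrete, the following is a minimal sketch of *standard* parallel tempering on a 1-D bimodal target, using a fixed geometric temperature ladder and adjacent-pair swap moves. It does not implement the adaptive policy-gradient scheme described above; all names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    # Mixture of two well-separated Gaussians -- hard for a single cold chain.
    return np.logaddexp(-0.5 * (x - 4.0) ** 2, -0.5 * (x + 4.0) ** 2)

n_chains = 4
temps = 2.0 ** np.arange(n_chains)   # geometric ladder: 1, 2, 4, 8
x = np.zeros(n_chains)               # one state per temperature

samples = []
for step in range(5000):
    # Within-chain Metropolis update at each temperature T_k
    # (target density tempered as pi(x)^(1/T_k)).
    for k in range(n_chains):
        prop = x[k] + rng.normal(scale=1.0)
        if np.log(rng.random()) < (log_target(prop) - log_target(x[k])) / temps[k]:
            x[k] = prop
    # Swap move between a random adjacent pair of temperatures.
    k = rng.integers(n_chains - 1)
    log_ratio = (1.0 / temps[k] - 1.0 / temps[k + 1]) * (
        log_target(x[k + 1]) - log_target(x[k])
    )
    if np.log(rng.random()) < log_ratio:
        x[k], x[k + 1] = x[k + 1], x[k]
    samples.append(x[0])             # only the cold chain (T=1) targets pi

samples = np.array(samples)
# The cold chain should visit both modes near -4 and +4.
```

The adaptive method in the abstract replaces the fixed `temps` ladder here with temperatures adjusted online to reduce the cold chain's integrated autocorrelation time.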