Stochastic gradient (SG) Markov chain Monte Carlo (MCMC) algorithms are popular for Bayesian sampling in the presence of large datasets. However, they come with few theoretical guarantees, and assessing their empirical performance is non-trivial. In this context, it is crucial to develop algorithms that are robust to the choice of hyperparameters and to gradient heterogeneity, since in practice both the choice of step size and the behavior of the target gradients induce hard-to-control biases in the invariant distribution. In this work we introduce the stochastic gradient Barker dynamics (SGBD) algorithm, extending the recently developed Barker MCMC scheme, a robust alternative to Langevin-based sampling algorithms, to the stochastic gradient framework. We characterize the impact of stochastic gradients on the Barker transition mechanism and develop a bias-corrected version that, under suitable assumptions, eliminates the error due to the gradient noise in the proposal. We illustrate the performance on a number of high-dimensional examples, showing that SGBD is more robust to hyperparameter tuning and to irregular behavior of the target gradients compared to the popular stochastic gradient Langevin dynamics algorithm.
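To make the transition mechanism concrete, the following is a minimal sketch of one stochastic gradient Barker step. It assumes the standard coordinate-wise Barker proposal with Gaussian increments: each coordinate moves by +z_i or -z_i, with the sign drawn with probability 1/(1 + exp(-z_i g_i)), where g is a minibatch gradient estimate of the log-target. The function name `sgbd_step` and the interface are illustrative, not the authors' implementation, and no bias correction is included.

```python
import numpy as np

def sgbd_step(x, grad_estimate, step_size, rng):
    """One illustrative stochastic gradient Barker dynamics step.

    x             : current state, array of shape (d,)
    grad_estimate : callable returning a noisy (e.g. minibatch) estimate
                    of the gradient of log pi at x
    step_size     : scale of the symmetric Gaussian increment
    rng           : numpy random Generator
    """
    g = grad_estimate(x)                           # stochastic gradient of log-target
    z = step_size * rng.standard_normal(x.shape)   # symmetric proposal increment
    # Coordinate-wise Barker flip: move by +z_i with probability
    # 1 / (1 + exp(-z_i * g_i)), computed via tanh for numerical stability.
    p_plus = 0.5 * (1.0 + np.tanh(0.5 * z * g))
    b = np.where(rng.random(x.shape) < p_plus, 1.0, -1.0)
    return x + b * z
```

Because the flip probability depends on the product z_i g_i, the chain drifts toward higher log-density while remaining bounded in how strongly any single gradient coordinate can push it, which is the source of the robustness to gradient heterogeneity discussed above.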