Stein thinning is a promising algorithm proposed by Riabiz et al. (2022) for post-processing the output of Markov chain Monte Carlo (MCMC). The main principle is to greedily minimize the kernelized Stein discrepancy (KSD), which only requires the gradient of the log-target distribution and is thus well suited to Bayesian inference. The main advantages of Stein thinning are the automatic removal of the burn-in period, the correction of the bias introduced by recent MCMC algorithms, and asymptotic convergence guarantees towards the target distribution. Nevertheless, Stein thinning suffers from several empirical pathologies, which may result in poor approximations, as observed in the literature. In this article, we conduct a theoretical analysis of these pathologies to clearly identify the mechanisms at play and suggest improved strategies. Then, we introduce the regularized Stein thinning algorithm to alleviate the identified pathologies. Finally, theoretical guarantees and extensive experiments demonstrate the high efficiency of the proposed algorithm. An implementation of regularized Stein thinning is available as the kernax library, written in Python and JAX, at https://gitlab.com/drti/kernax.
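To make the greedy principle concrete, the following is a minimal NumPy sketch of plain (unregularized) Stein thinning. It is an illustration under stated assumptions, not the kernax API and not the regularized algorithm introduced in this article: the function names `stein_kernel_row` and `stein_thinning` are hypothetical, and the base kernel is assumed to be the inverse multiquadric $(c^2 + \|x - y\|^2)^{-1/2}$ with $c = 1$, a common choice in the Stein thinning literature. The only information needed about the target is the score, that is, the gradient of the log-density evaluated at each sample.

```python
import numpy as np


def stein_kernel_row(x, sx, X, S, c=1.0):
    """Langevin Stein kernel k_p(x, X[i]) for every row i of X.

    Assumes the IMQ base kernel (c^2 + ||x - y||^2)^(-1/2).
    x, sx : (d,) point and its score, grad log p(x).
    X, S  : (n, d) samples and their scores.
    """
    d = X.shape[1]
    r = x - X                               # differences x - y, shape (n, d)
    r2 = np.sum(r ** 2, axis=1)             # squared distances
    q = c ** 2 + r2
    return (
        -3.0 * r2 / q ** 2.5                            # part of the trace term
        + (d + np.sum((sx - S) * r, axis=1)) / q ** 1.5  # trace + score/gradient cross terms
        + (S @ sx) / q ** 0.5                            # score-score term
    )


def stein_thinning(X, S, m, c=1.0):
    """Greedily select m indices of X that minimize the KSD.

    At each step, pick the index i minimizing
        k_p(x_i, x_i) / 2 + sum over already selected j of k_p(x_j, x_i).
    Indices may repeat, since selection is done with replacement.
    """
    n, d = X.shape
    # Diagonal of the Stein kernel for the assumed IMQ base kernel.
    diag = d / c ** 3 + np.sum(S ** 2, axis=1) / c
    objective = 0.5 * diag
    selected = np.empty(m, dtype=int)
    for t in range(m):
        j = int(np.argmin(objective))
        selected[t] = j
        # Add the interaction of the new point with every candidate.
        objective = objective + stein_kernel_row(X[j], S[j], X, S, c)
    return selected
```

For a standard Gaussian target the score is simply $-x$, so a call such as `stein_thinning(X, -X, m=20)` returns the indices of 20 retained samples. The cost is $O(nm)$ kernel evaluations, because each greedy step only updates a running sum rather than recomputing the full discrepancy.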