Regenerative Rejection Sampling

This thesis presents Regenerative Rejection Sampling (RRS), a novel approximate sampling algorithm inspired by classical Rejection Sampling and Markov Chain Monte Carlo (MCMC) methods. The method constructs a continuous-time regenerative process whose stationary distribution coincides with a target density known only up to a normalizing constant. Unlike standard Rejection Sampling, RRS does not require the existence of a finite constant that upper-bounds the likelihood ratio. As a result, its convergence in total variation remains exponential in a larger class of scenarios than that of, for example, the Independent Metropolis-Hastings sampler, which requires a finite bounding constant. To explain how the method works, we first present a detailed review of renewal and regenerative processes, including their limit theorems, stationary versions, and convergence properties under standard conditions. We explain a coupling proof of exponential convergence for regenerative processes under the assumption of a spread-out cycle length distribution. We then introduce the RRS algorithm and derive its convergence rate. Its performance is compared theoretically and empirically with classical MCMC methods. Numerical experiments demonstrate that RRS can exhibit lower autocorrelations and faster effective mixing, both in synthetic examples and in a Bayesian probit regression model applied to a real medical dataset. Moreover, if the algorithm is run until time $t$, we show that the usual $O(1/t)$ bias of time-average estimators improves to $O(1/t^2)$ for the estimator constructed from the RRS method, and we provide easy-to-estimate non-asymptotic bounds for this bias.
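For context, the finite bounding constant mentioned above can be made concrete with a minimal sketch of classical Rejection Sampling. This is the textbook baseline, not the RRS algorithm itself. The function names, the Gaussian target, and the Cauchy proposal are illustrative assumptions that do not come from the thesis.

```python
import math
import random


def rejection_sample(target_unnorm, proposal_sample, proposal_pdf, M, rng):
    """Draw one sample from the density proportional to target_unnorm.

    Classical rejection sampling: valid only if
    target_unnorm(x) <= M * proposal_pdf(x) for every x, with M finite.
    """
    while True:
        x = proposal_sample(rng)
        u = rng.random()
        # Accept x with probability target_unnorm(x) / (M * proposal_pdf(x)).
        if u * M * proposal_pdf(x) <= target_unnorm(x):
            return x


# Illustrative (assumed) target: unnormalized standard normal density.
def target_unnorm(x):
    return math.exp(-0.5 * x * x)


# Illustrative (assumed) proposal: standard Cauchy, sampled by inverse CDF.
def cauchy_sample(rng):
    return math.tan(math.pi * (rng.random() - 0.5))


def cauchy_pdf(x):
    return 1.0 / (math.pi * (1.0 + x * x))


# sup_x target_unnorm(x) / cauchy_pdf(x) is attained at x = +/-1
# and equals 2 * pi * exp(-1/2), so this M is a valid finite bound.
M = 2.0 * math.pi * math.exp(-0.5)

if __name__ == "__main__":
    rng = random.Random(0)
    draws = [rejection_sample(target_unnorm, cauchy_sample, cauchy_pdf, M, rng)
             for _ in range(5)]
    print(draws)
```

The acceptance step relies on `M` being finite. If the roles were swapped (a Cauchy target with a Gaussian proposal), the ratio of target to proposal would be unbounded and no valid `M` would exist. That unbounded case is the one the abstract says RRS is designed to handle.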
