Despite the widespread use of the data augmentation (DA) algorithm, the theoretical understanding of its convergence behavior remains incomplete. We prove the first non-asymptotic polynomial upper bounds on the mixing times of three important DA algorithms: the DA algorithms for Bayesian Probit regression (Albert and Chib, 1993; ProbitDA), Bayesian Logit regression (Polson, Scott, and Windle, 2013; LogitDA), and Bayesian Lasso regression (Park and Casella, 2008; Rajaratnam et al., 2015; LassoDA). Concretely, we demonstrate that with an $\eta$-warm start, parameter dimension $d$, and sample size $n$, ProbitDA and LogitDA require $\mathcal{O}\left(nd\log \left(\frac{\log \eta}{\epsilon}\right)\right)$ steps to obtain samples with at most $\epsilon$ total variation (TV) error, whereas LassoDA requires $\mathcal{O}\left(d^2(d\log d +n \log n)^2 \log \left(\frac{\eta}{\epsilon}\right)\right)$ steps. The results are generally applicable to settings with large $n$ and large $d$, including settings with highly imbalanced response data in Probit and Logit regression. The proofs are based on Markov chain conductance and isoperimetric inequalities. Assuming that the data are generated independently from a bounded, sub-Gaussian, or log-concave distribution, we improve the guarantees for ProbitDA and LogitDA to $\tilde{\mathcal{O}}(n+d)$ with high probability, and compare them with the best known guarantees for Langevin Monte Carlo and the Metropolis Adjusted Langevin Algorithm. We also discuss the mixing times of the three algorithms under feasible initialization.
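For readers unfamiliar with ProbitDA, the following is a minimal sketch of the two-block sampler in the style of Albert and Chib (1993): draw truncated-normal latent variables given the coefficients, then draw the coefficients from a Gaussian given the latent variables. The abstract does not state the prior, so the $N(0, \texttt{prior\_var} \cdot I)$ prior, the function names (`probit_da`, `sample_latent`, and the linear algebra helpers), and the default values are all illustrative assumptions rather than the paper's setup. The code uses only the Python standard library and is intended for small $d$.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def cholesky(a):
    """Lower-triangular L with L L^T = a (a symmetric positive definite)."""
    d = len(a)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = s ** 0.5 if i == j else s / L[j][j]
    return L

def forward_solve(L, b):
    """Solve L w = b for lower-triangular L."""
    w = []
    for i in range(len(b)):
        w.append((b[i] - sum(L[i][k] * w[k] for k in range(i))) / L[i][i])
    return w

def backward_solve(L, b):
    """Solve L^T x = b for lower-triangular L."""
    d = len(b)
    x = [0.0] * d
    for i in reversed(range(d)):
        x[i] = (b[i] - sum(L[k][i] * x[k] for k in range(i + 1, d))) / L[i][i]
    return x

def sample_latent(mean, label, rng):
    """Draw z ~ N(mean, 1) restricted to z > 0 if label == 1, else z <= 0.

    Uses inverse-CDF sampling, which is simple but numerically fragile
    when |mean| is very large; a production sampler would use a more
    robust truncated-normal method.
    """
    p0 = STD_NORMAL.cdf(-mean)  # probability that the untruncated z is <= 0
    u = rng.random()
    p = p0 + u * (1.0 - p0) if label == 1 else u * p0
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # inv_cdf needs p strictly in (0, 1)
    return mean + STD_NORMAL.inv_cdf(p)

def probit_da(X, y, n_steps, prior_var=100.0, seed=0):
    """Run n_steps of the DA chain; returns the list of coefficient draws.

    Assumed prior (not from the source): beta ~ N(0, prior_var * I).
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    # Conditional precision of beta given z: Q = X^T X + I / prior_var.
    Q = [[sum(X[i][a] * X[i][b] for i in range(n))
          + (1.0 / prior_var if a == b else 0.0)
          for b in range(d)] for a in range(d)]
    L = cholesky(Q)  # fixed across iterations, so factor once
    beta = [0.0] * d
    draws = []
    for _ in range(n_steps):
        # Step 1: z_i | beta, y_i is a truncated normal centered at x_i^T beta.
        z = [sample_latent(sum(xi[j] * beta[j] for j in range(d)), yi, rng)
             for xi, yi in zip(X, y)]
        # Step 2: beta | z ~ N(Q^{-1} X^T z, Q^{-1}).
        # With Q = L L^T, solving L^T beta = w + e (where L w = X^T z and
        # e ~ N(0, I)) yields exactly this mean and covariance.
        xtz = [sum(X[i][j] * z[i] for i in range(n)) for j in range(d)]
        w = forward_solve(L, xtz)
        noise = [rng.gauss(0.0, 1.0) for _ in range(d)]
        beta = backward_solve(L, [wi + ei for wi, ei in zip(w, noise)])
        draws.append(beta)
    return draws
```

Each iteration costs $O(nd)$ for the matrix-vector products plus $O(d^2)$ for the triangular solves, since the Cholesky factor is computed once. The number of iterations needed is the subject of the mixing-time bounds stated in the abstract.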