Variational inference has recently emerged as a popular alternative to classical Markov chain Monte Carlo (MCMC) in large-scale Bayesian inference. The core idea is to trade statistical accuracy for computational efficiency. In this work, we study these statistical and computational trade-offs in variational inference via a case study in inferential model selection. Focusing on Gaussian inferential models (or variational approximating families) with diagonal plus low-rank precision matrices, we initiate a theoretical study of the trade-offs in two aspects: Bayesian posterior inference error and frequentist uncertainty quantification error. From the Bayesian posterior inference perspective, we characterize the error of the variational posterior relative to the exact posterior. We prove that, given a fixed computation budget, a lower-rank inferential model produces variational posteriors with a higher statistical approximation error but a lower computational error; it reduces variance in stochastic optimization and, in turn, accelerates convergence. From the frequentist uncertainty quantification perspective, we consider the precision matrix of the variational posterior as an uncertainty estimate, which involves an additional statistical error originating from the sampling uncertainty of the data. As a consequence, for small datasets, the inferential model need not be full-rank to achieve optimal estimation error, even with an unlimited computation budget.
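To make the "diagonal plus low-rank precision" family concrete, the following is a minimal sketch, assuming the precision matrix is parameterized as diag(d) + U Uᵀ with a positive p-vector d and a p-by-r factor U. This parameterization and the helper names `n_stored`, `precision_matvec`, and `dense_precision` are illustrative assumptions, not taken from the paper. The sketch shows one place where a computational saving can come from: a lower rank r means fewer numbers to optimize and a cheaper matrix-vector product.

```python
# Illustrative sketch only: the parameterization diag(d) + U U^T and all
# function names here are assumptions, not the paper's notation.

def n_stored(p, r):
    """Numbers stored for a rank-r model: p diagonal entries plus a p-by-r factor."""
    return p + p * r


def precision_matvec(d, U, x):
    """Compute (diag(d) + U U^T) x in O(p * r) time.

    d: list of p positive floats (diagonal part)
    U: list of p rows, each a list of r floats (low-rank factor)
    x: list of p floats
    The p-by-p precision matrix is never formed explicitly.
    """
    p = len(x)
    r = len(U[0]) if U else 0
    # t = U^T x, an r-vector
    t = [sum(U[i][k] * x[i] for i in range(p)) for k in range(r)]
    # diag(d) x + U t
    return [d[i] * x[i] + sum(U[i][k] * t[k] for k in range(r)) for i in range(p)]


def dense_precision(d, U):
    """Form the full p-by-p precision matrix; used here only as a reference check."""
    p = len(d)
    r = len(U[0]) if U else 0
    return [
        [(d[i] if i == j else 0.0) + sum(U[i][k] * U[j][k] for k in range(r))
         for j in range(p)]
        for i in range(p)
    ]


if __name__ == "__main__":
    d = [1.0, 2.0, 3.0]
    U = [[1.0], [0.0], [2.0]]  # rank r = 1
    x = [1.0, 1.0, 1.0]
    print(precision_matvec(d, U, x))
    print(n_stored(p=3, r=1))
```

For comparison, a full symmetric p-by-p precision matrix has p(p+1)/2 free entries, so with p = 1000 a rank-5 model stores 6000 numbers against 500500 for the full-rank one. How this reduction translates into the optimization variance and convergence behavior described above is the subject of the paper's analysis and is not captured by this sketch.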