Automatic differentiation variational inference (ADVI) offers fast and easy-to-use posterior approximation in multiple modern probabilistic programming languages. However, its stochastic optimizer lacks clear convergence criteria and requires tuning parameters. Moreover, ADVI inherits the poor posterior uncertainty estimates of mean-field variational Bayes (MFVB). We introduce "deterministic ADVI" (DADVI) to address these issues. DADVI replaces the intractable MFVB objective with a fixed Monte Carlo approximation, a technique known in the stochastic optimization literature as the "sample average approximation" (SAA). By optimizing an approximate but deterministic objective, DADVI can use off-the-shelf second-order optimization, and, unlike standard mean-field ADVI, is amenable to more accurate posterior covariances via linear response (LR). In contrast to existing worst-case theory, we show that, on certain classes of common statistical problems, DADVI and the SAA can perform well with relatively few samples even in very high dimensions, though we also show that such favorable results cannot extend to variational approximations that are too expressive relative to mean-field ADVI. We show on a variety of real-world problems that DADVI reliably finds good solutions with default settings (unlike ADVI) and, together with LR covariances, is typically faster and more accurate than standard ADVI.
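To make the SAA idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a one-dimensional target whose log density and first two derivatives are available in closed form, and a Gaussian approximation parameterized by a mean `mu` and a log standard deviation `rho`. The standard-normal draws are sampled once and then held fixed, so the objective is an ordinary deterministic function that a damped Newton iteration can minimize. All function names (`draw_fixed_normals`, `make_saa_objective`, `newton_minimize`, `fit_saa`) and the toy Gaussian target are hypothetical and chosen only for illustration; linear response covariances are not shown.

```python
import math
import random


def draw_fixed_normals(n, seed):
    """Draw the standard-normal base samples once; they are never redrawn."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def make_saa_objective(log_p, dlog_p, d2log_p, draws):
    """Build f(mu, rho) -> (value, gradient, Hessian) for the fixed-draw objective.

    f(mu, rho) = -(1/N) * sum_n log_p(mu + exp(rho) * z_n) - rho,
    i.e. the KL divergence up to an additive constant, with the expectation
    replaced by an average over the fixed draws z_n.
    """
    n = len(draws)

    def objective(mu, rho):
        sigma = math.exp(rho)
        val = -rho                    # negative Gaussian entropy, up to a constant
        g_mu, g_rho = 0.0, -1.0
        h_mm = h_mr = h_rr = 0.0
        for z in draws:
            theta = mu + sigma * z    # reparameterized sample
            sz = sigma * z            # d(theta)/d(rho)
            d1, d2 = dlog_p(theta), d2log_p(theta)
            val -= log_p(theta) / n
            g_mu -= d1 / n
            g_rho -= d1 * sz / n
            h_mm -= d2 / n
            h_mr -= d2 * sz / n
            h_rr -= (d2 * sz * sz + d1 * sz) / n
        return val, (g_mu, g_rho), (h_mm, h_mr, h_rr)

    return objective


def newton_minimize(objective, mu=0.0, rho=0.0, tol=1e-8, max_iter=100):
    """Damped Newton iteration with a backtracking (Armijo) line search."""
    for _ in range(max_iter):
        val, (g_m, g_r), (h_mm, h_mr, h_rr) = objective(mu, rho)
        if math.hypot(g_m, g_r) < tol:
            break
        det = h_mm * h_rr - h_mr * h_mr
        if h_mm > 0 and det > 0:
            # Hessian is positive definite: solve the 2x2 system H d = -g.
            d_m = -(h_rr * g_m - h_mr * g_r) / det
            d_r = -(h_mm * g_r - h_mr * g_m) / det
        else:
            # Otherwise fall back to the steepest-descent direction.
            d_m, d_r = -g_m, -g_r
        slope = g_m * d_m + g_r * d_r
        step = 1.0
        while step > 1e-12:
            new_val = objective(mu + step * d_m, rho + step * d_r)[0]
            if new_val <= val + 1e-4 * step * slope:
                break
            step *= 0.5
        mu += step * d_m
        rho += step * d_r
    return mu, rho


def fit_saa(log_p, dlog_p, d2log_p, draws):
    """Return the fitted mean and standard deviation for the given fixed draws."""
    objective = make_saa_objective(log_p, dlog_p, d2log_p, draws)
    mu, rho = newton_minimize(objective)
    return mu, math.exp(rho)


# Toy target (an assumption for illustration): unnormalized N(1.0, 0.5^2).
TARGET_MEAN, TARGET_SD = 1.0, 0.5


def log_p(theta):
    return -0.5 * ((theta - TARGET_MEAN) / TARGET_SD) ** 2


def dlog_p(theta):
    return -(theta - TARGET_MEAN) / TARGET_SD ** 2


def d2log_p(theta):
    return -1.0 / TARGET_SD ** 2


draws = draw_fixed_normals(30, seed=0)
mu_hat, sigma_hat = fit_saa(log_p, dlog_p, d2log_p, draws)
print(mu_hat, sigma_hat)
```

Because the draws are fixed, rerunning the fit with the same draws returns the same answer, and convergence can be judged by the gradient norm instead of by inspecting a noisy trace. For this particular Gaussian target the fixed-draw optimum is available in closed form: the fitted standard deviation is the target standard deviation divided by the (population) standard deviation of the draws, and the fitted mean is the target mean minus the fitted standard deviation times the sample mean of the draws. The approximation error therefore comes entirely from how far the 30 fixed draws are from having mean zero and unit variance.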