Automatic differentiation variational inference (ADVI) offers fast and easy-to-use posterior approximation in multiple modern probabilistic programming languages. However, its stochastic optimizer lacks clear convergence criteria and requires tuning parameters. Moreover, ADVI inherits the poor posterior uncertainty estimates of mean-field variational Bayes (MFVB). We introduce "deterministic ADVI" (DADVI) to address these issues. DADVI replaces the intractable MFVB objective with a fixed Monte Carlo approximation, a technique known in the stochastic optimization literature as the "sample average approximation" (SAA). By optimizing an approximate but deterministic objective, DADVI can use off-the-shelf second-order optimization, and, unlike standard mean-field ADVI, is amenable to more accurate posterior covariances via linear response (LR). In contrast to existing worst-case theory, we show that, on certain classes of common statistical problems, DADVI and the SAA can perform well with relatively few samples even in very high dimensions, though we also show that such favorable results cannot extend to variational approximations that are too expressive relative to mean-field ADVI. We show on a variety of real-world problems that DADVI reliably finds good solutions with default settings (unlike ADVI) and, together with LR covariances, is typically faster and more accurate than standard ADVI.
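To make the sample average approximation concrete, the following is a minimal sketch on a toy problem and not the authors' implementation. It assumes a one-dimensional Gaussian target N(m, s^2) and a Gaussian approximation with mean `mu` and log standard deviation `r`. The names `saa_objective`, `grad_and_hess`, and `fit_dadvi` are hypothetical, and a hand-written damped Newton iteration stands in for the off-the-shelf second-order optimizer. Because the standard normal draws `z` are fixed once, the objective is an ordinary deterministic function, so the gradient norm gives a clear stopping rule.

```python
import math
import random


def saa_objective(mu, r, z, m, s):
    """Fixed-draw estimate of KL(q || p), up to an additive constant.

    q = Normal(mu, exp(r)^2); toy target p = Normal(m, s^2) (an assumption).
    """
    sigma = math.exp(r)
    neg_log_p = sum((mu + sigma * zn - m) ** 2 for zn in z) / (2 * s**2 * len(z))
    return neg_log_p - r  # entropy of q equals r plus a constant


def grad_and_hess(mu, r, z, m, s):
    """Exact gradient and Hessian of saa_objective with respect to (mu, r)."""
    n = len(z)
    sigma = math.exp(r)
    zbar = sum(z) / n
    z2 = sum(zn * zn for zn in z) / n
    d = mu - m
    g_mu = (d + sigma * zbar) / s**2
    g_r = (d * sigma * zbar + sigma**2 * z2) / s**2 - 1.0
    h_mm = 1.0 / s**2
    h_mr = sigma * zbar / s**2
    h_rr = (d * sigma * zbar + 2 * sigma**2 * z2) / s**2
    return (g_mu, g_r), (h_mm, h_mr, h_rr)


def fit_dadvi(z, m, s, tol=1e-9, max_iter=100):
    """Minimize the fixed-draw objective with damped Newton steps."""
    mu, r = 0.0, 0.0
    for _ in range(max_iter):
        (g_mu, g_r), (h_mm, h_mr, h_rr) = grad_and_hess(mu, r, z, m, s)
        if math.hypot(g_mu, g_r) < tol:  # deterministic convergence check
            break
        det = h_mm * h_rr - h_mr**2
        if det > 0:
            # Newton direction: solve the 2x2 system H p = -g.
            p_mu = -(h_rr * g_mu - h_mr * g_r) / det
            p_r = -(h_mm * g_r - h_mr * g_mu) / det
        else:
            # Hessian not positive definite: fall back to steepest descent.
            p_mu, p_r = -g_mu, -g_r
        f0 = saa_objective(mu, r, z, m, s)
        slope = g_mu * p_mu + g_r * p_r
        t = 1.0
        # Backtracking line search (Armijo condition).
        while t > 1e-12 and (
            saa_objective(mu + t * p_mu, r + t * p_r, z, m, s)
            > f0 + 1e-4 * t * slope
        ):
            t *= 0.5
        mu, r = mu + t * p_mu, r + t * p_r
    return mu, math.exp(r)


random.seed(0)
z = [random.gauss(0.0, 1.0) for _ in range(30)]  # draws fixed once, then reused
mu_hat, sigma_hat = fit_dadvi(z, m=2.0, s=0.5)
print(mu_hat, sigma_hat)
```

For this toy target the fixed-draw optimum has a closed form: the fitted standard deviation equals `s` divided by the (biased) sample standard deviation of `z`, and the fitted mean equals `m` minus the fitted standard deviation times the sample mean of `z`. Both approach the exact values `s` and `m` as the number of draws grows, which illustrates the sense in which the objective is approximate but deterministic.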