Automatic differentiation variational inference (ADVI) offers fast and easy-to-use posterior approximation in multiple modern probabilistic programming languages. However, its stochastic optimizer lacks clear convergence criteria and requires tuning parameters. Moreover, ADVI inherits the poor posterior uncertainty estimates of mean-field variational Bayes (MFVB). We introduce ``deterministic ADVI'' (DADVI) to address these issues. DADVI replaces the intractable MFVB objective with a fixed Monte Carlo approximation, a technique known in the stochastic optimization literature as the ``sample average approximation'' (SAA). By optimizing an approximate but deterministic objective, DADVI can use off-the-shelf second-order optimization, and, unlike standard mean-field ADVI, is amenable to more accurate posterior covariances via linear response (LR). In contrast to existing worst-case theory, we show that, on certain classes of common statistical problems, DADVI and the SAA can perform well with relatively few samples even in very high dimensions, though we also show that such favorable results cannot extend to variational approximations that are too expressive relative to mean-field ADVI. We show on a variety of real-world problems that DADVI reliably finds good solutions with default settings (unlike ADVI) and, together with LR covariances, is typically faster and more accurate than standard ADVI.
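To make the sample average approximation concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a one-dimensional toy target (a Gaussian with hypothetical mean `M` and standard deviation `S`), a Gaussian approximation parameterized by `mu` and `sigma`, and hand-derived gradients and Hessians in place of automatic differentiation. All function names are illustrative. The noise draws are generated once and then held fixed, so the objective is an ordinary deterministic function that a damped Newton method can minimize with a plain gradient-norm stopping rule.

```python
import math
import random

# Toy target (an assumption for illustration): an unnormalized Gaussian
# log-density with mean M and standard deviation S.
M, S = 2.0, 0.5


def draw_fixed_noise(num_draws, seed=0):
    """Draw standard-normal noise once; it is reused for every evaluation."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(num_draws)]


def saa_objective(mu, sigma, z):
    """Negative ELBO estimate with the noise z held fixed (deterministic)."""
    avg_neg_log_p = sum((mu + sigma * zn - M) ** 2 for zn in z) / (2 * S**2 * len(z))
    return avg_neg_log_p - math.log(sigma)  # subtract the entropy, up to a constant


def grad_and_hessian(mu, sigma, z):
    """Hand-derived gradient and Hessian of saa_objective in (mu, sigma)."""
    n = len(z)
    zbar = sum(z) / n
    z2 = sum(zn * zn for zn in z) / n
    g_mu = ((mu - M) + sigma * zbar) / S**2
    g_sigma = ((mu - M) * zbar + sigma * z2) / S**2 - 1.0 / sigma
    hessian = (1.0 / S**2, zbar / S**2, z2 / S**2 + 1.0 / sigma**2)  # (mm, ms, ss)
    return (g_mu, g_sigma), hessian


def fit_dadvi_sketch(z, mu=0.0, sigma=1.0, tol=1e-8, max_iter=50):
    """Damped Newton iterations on the fixed-draw objective."""
    for _ in range(max_iter):
        (g_mu, g_sigma), (h_mm, h_ms, h_ss) = grad_and_hessian(mu, sigma, z)
        if math.hypot(g_mu, g_sigma) < tol:  # clear convergence criterion
            break
        # Solve the 2x2 Newton system H * step = -g explicitly.
        det = h_mm * h_ss - h_ms**2
        step_mu = -(h_ss * g_mu - h_ms * g_sigma) / det
        step_sigma = -(h_mm * g_sigma - h_ms * g_mu) / det
        # Halve the step until sigma stays positive and the objective does not increase.
        t, f0 = 1.0, saa_objective(mu, sigma, z)
        while (sigma + t * step_sigma <= 0
               or saa_objective(mu + t * step_mu, sigma + t * step_sigma, z) > f0):
            t *= 0.5
        mu, sigma = mu + t * step_mu, sigma + t * step_sigma
    return mu, sigma


if __name__ == "__main__":
    z = draw_fixed_noise(num_draws=30, seed=0)
    print(fit_dadvi_sketch(z))
```

Because the draws are fixed, repeated calls with the same `z` return identical results, which is what permits standard convergence checks. In this toy case the minimizer is close to, but not exactly, `(M, S)`: the gap reflects the sampling error of the fixed draws and shrinks as `num_draws` grows. In a real model the gradient and Hessian-vector products would come from automatic differentiation, and an off-the-shelf second-order optimizer would replace the hand-written Newton loop.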