Black-box variational inference (BBVI) is a methodology for posterior approximation that relies on stochastic optimization. In practice, the stochastic optimizers underpinning BBVI generally require extensive problem-specific tuning, which undermines its promise as a truly "black box" inference algorithm. However, over the past decade, many new adaptive stochastic optimization algorithms have been developed that reduce or remove entirely the need for tuning. In this work, we investigate this new collection of adaptive methods in the context of BBVI, with the goal of establishing the current state of the art in tuning-free optimization-based inference. In particular, we present a large-scale empirical evaluation of 56 stochastic gradient-based optimization algorithms applied to 1092 Bayesian inference optimization problems, involving over 550,000 individual optimization runs and 15 core-years of compute. The optimization algorithms we evaluate are chosen to represent a wide spectrum of recent approaches and the benchmark problems are chosen to span a range of difficulty, with posterior target dimension 1-10^4, condition number 1-10^8, and a range of variational families. Our results show that no single method dominates, but running a selection of 5 algorithms suffices to reliably get close to the best-possible observed performance. We thus provide a strong baseline for applications where expert tuning is not possible and for comparison when developing new stochastic optimization algorithms.
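To make the role of the stochastic optimizer concrete, the following is a minimal sketch of BBVI, not the benchmark code from this work. It assumes a mean-field Gaussian variational family, a diagonal Gaussian target, a single-sample reparameterization gradient of the ELBO, and plain SGD with a hand-picked fixed step size. The names `bbvi_sgd`, `grad_log_target`, and `step_size` are hypothetical and chosen for illustration only.

```python
import math
import random


def grad_log_target(z, mu, sigma):
    """Gradient of log N(z | mu, diag(sigma^2)) with respect to z."""
    return [-(zi - mi) / (si * si) for zi, mi, si in zip(z, mu, sigma)]


def bbvi_sgd(mu, sigma, step_size=0.01, n_iters=5000, seed=0):
    """Fit q(z) = N(m, diag(s^2)) to the target by stochastic gradient ascent on the ELBO.

    Assumption for illustration: the target is a diagonal Gaussian, so the
    exact answer is m = mu and s = sigma.
    """
    rng = random.Random(seed)
    dim = len(mu)
    m = [0.0] * dim       # variational means
    log_s = [0.0] * dim   # log of variational standard deviations

    for _ in range(n_iters):
        # Reparameterization: z = m + s * eps with eps ~ N(0, I).
        eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        s = [math.exp(v) for v in log_s]
        z = [m[i] + s[i] * eps[i] for i in range(dim)]
        g = grad_log_target(z, mu, sigma)

        for i in range(dim):
            # d ELBO / d m_i: pathwise gradient of the expected log density.
            m[i] += step_size * g[i]
            # d ELBO / d log_s_i: pathwise term plus 1 from the Gaussian entropy.
            log_s[i] += step_size * (g[i] * eps[i] * s[i] + 1.0)

    return m, [math.exp(v) for v in log_s]


# Hypothetical two-dimensional target with unequal scales.
m, s = bbvi_sgd(mu=[1.0, -2.0], sigma=[1.0, 0.5])
print("means:", m)
print("standard deviations:", s)
```

The fixed `step_size` here was chosen by hand for this particular target. A value that works for one scale or dimension may converge slowly or diverge on another, which is the tuning burden that adaptive optimizers aim to reduce.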