Stein variational gradient descent (SVGD) [Liu and Wang, 2016] performs approximate Bayesian inference by representing the posterior with a set of particles. However, SVGD suffers from variance collapse, i.e. poor predictions due to underestimated uncertainty [Ba et al., 2021], even for models of moderate dimensionality such as small Bayesian neural networks (BNNs). To address this issue, we generalize SVGD by letting each particle parameterize a component distribution in a mixture model. Our method, Stein Mixture Inference (SMI), optimizes a lower bound on the evidence (ELBO) and introduces user-specified guides parameterized by particles. SMI extends the Nonlinear SVGD (NSVGD) framework [Wang and Liu, 2019] to the variational-Bayes setting. SMI effectively avoids variance collapse, as judged by a test previously developed for this purpose, and performs well on standard data sets. In addition, SMI requires considerably fewer particles than SVGD to accurately estimate uncertainty for small BNNs. The synergistic combination of NSVGD, ELBO optimization and user-specified guides establishes a promising approach towards variational Bayesian inference for tall and wide data.
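As background for the method the abstract generalizes, the standard SVGD update of Liu and Wang [2016] can be sketched as follows. This is a minimal NumPy illustration, not the SMI algorithm itself: the function names (`svgd_step`, `rbf_kernel`) and the median-heuristic bandwidth choice are illustrative assumptions, and the demo target (a standard normal, whose score is simply `-x`) is a toy example.

```python
import numpy as np

def rbf_kernel(x, h):
    """RBF kernel matrix k[j, i] = exp(-||x_j - x_i||^2 / h) and its
    gradient with respect to x_j, for particles x of shape (n, d)."""
    diff = x[:, None, :] - x[None, :, :]          # diff[j, i] = x_j - x_i
    sq = np.sum(diff ** 2, axis=-1)               # pairwise squared distances
    k = np.exp(-sq / h)
    grad_k = -2.0 / h * diff * k[:, :, None]      # d k(x_j, x_i) / d x_j
    return k, grad_k

def svgd_step(x, score, stepsize=0.1):
    """One SVGD update: x_i <- x_i + eps * phi(x_i), where
    phi(x_i) = (1/n) sum_j [ k(x_j, x_i) score(x_j) + grad_{x_j} k(x_j, x_i) ].
    `score` maps particles to the gradient of the log target density."""
    n = x.shape[0]
    sq = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    h = np.median(sq) / np.log(n + 1) + 1e-8      # median-heuristic bandwidth
    k, grad_k = rbf_kernel(x, h)
    # Attractive term (kernel-weighted scores) plus repulsive term (kernel gradients).
    phi = (k.T @ score(x) + grad_k.sum(axis=0)) / n
    return x + stepsize * phi

# Toy demo: transport particles initialized far away toward a standard normal.
rng = np.random.default_rng(0)
particles = rng.normal(5.0, 1.0, size=(50, 1))
for _ in range(1500):
    particles = svgd_step(particles, lambda p: -p, stepsize=0.2)
```

The repulsive `grad_k` term is what keeps the particles spread out; the variance collapse discussed above occurs when, in higher dimensions, this term fails to maintain enough spread and the particles underestimate the posterior's uncertainty.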