Bayesian neural networks (BNNs) promise to combine the predictive performance of neural networks with principled uncertainty modeling, which is important for safety-critical systems and decision making. However, posterior uncertainty estimates depend on the choice of prior, and finding informative priors in weight space has proven difficult. This has motivated variational inference (VI) methods that place priors directly on the function generated by the BNN rather than on its weights. In this paper, we address a fundamental issue with such function-space VI approaches pointed out by Burt et al. (2020), who showed that the objective function (ELBO) is negative infinity for most priors of interest. Our solution builds on generalized VI (Knoblauch et al., 2019) with the regularized KL divergence (Quang, 2019) and is, to the best of our knowledge, the first well-defined variational objective for function-space inference in BNNs with Gaussian process (GP) priors. Experiments show that our method incorporates the properties specified by the GP prior on synthetic and small real-world data sets, and provides competitive uncertainty estimates for regression, classification, and out-of-distribution detection compared to BNN baselines with both function-space and weight-space priors.
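As a rough finite-dimensional illustration of why regularization helps, the sketch below computes a Gaussian KL divergence after adding a multiple gamma of the identity to both covariance matrices. This is an assumption made for illustration only: the function names `gaussian_kl` and `regularized_gaussian_kl`, the toy rank-one covariance, the RBF kernel, and the value of gamma are all hypothetical, and the code is not the estimator used in the paper. It only shows that a degenerate covariance, for which the standard Gaussian KL is undefined, yields a finite value once both covariances are regularized.

```python
import numpy as np


def gaussian_kl(m_q, K_q, m_p, K_p):
    """KL( N(m_q, K_q) || N(m_p, K_p) ) for positive-definite covariances."""
    d = m_q.shape[0]
    L_q = np.linalg.cholesky(K_q)
    L_p = np.linalg.cholesky(K_p)
    # trace(K_p^{-1} K_q) equals the squared Frobenius norm of L_p^{-1} L_q.
    A = np.linalg.solve(L_p, L_q)
    trace_term = np.sum(A ** 2)
    # Mahalanobis distance between the means under K_p.
    diff = np.linalg.solve(L_p, m_p - m_q)
    maha = diff @ diff
    # log det K_p - log det K_q from the Cholesky diagonals.
    logdet = 2.0 * (np.sum(np.log(np.diag(L_p))) - np.sum(np.log(np.diag(L_q))))
    return 0.5 * (trace_term + maha - d + logdet)


def regularized_gaussian_kl(m_q, K_q, m_p, K_p, gamma):
    """Illustrative regularization: add gamma * I to both covariances."""
    eye = np.eye(m_q.shape[0])
    return gaussian_kl(m_q, K_q + gamma * eye, m_p, K_p + gamma * eye)


# Toy example: a rank-one (degenerate) covariance versus an RBF-kernel covariance.
x = np.linspace(-1.0, 1.0, 5)
K_prior = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)  # RBF kernel, unit lengthscale
K_degenerate = np.outer(x, x)                            # rank one, not invertible
zero_mean = np.zeros(5)

kl_value = regularized_gaussian_kl(zero_mean, K_degenerate, zero_mean, K_prior, gamma=1e-2)
print(f"regularized KL (toy example): {kl_value:.4f}")
```

In this toy setting, a larger gamma pulls both covariances toward the same scaled identity and shrinks the divergence, while a gamma close to zero makes the value grow as the degenerate covariance approaches singularity.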