Traditional neural networks are simple to train, but they typically produce overconfident predictions. In contrast, Bayesian neural networks provide good uncertainty quantification, but optimizing them is time-consuming because of the large parameter space. This paper proposes to combine the advantages of both approaches by performing Variational Inference in the Final layer Output space (VIFO), because the output space is much smaller than the parameter space. We use neural networks to learn the mean and the variance of the probabilistic output. Like standard, non-Bayesian models, VIFO enjoys simple training, and one can use Rademacher complexity to provide risk bounds for the model. On the other hand, using the Bayesian formulation, we incorporate collapsed variational inference into VIFO, which significantly improves performance in practice. Experiments show that VIFO and ensembles of VIFO provide a good tradeoff between run time and uncertainty quantification, especially for out-of-distribution data.
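To make the idea of variational inference over the output space concrete, the following is a minimal sketch and not the paper's implementation. It assumes a classification setting in which a network has already produced, for one input, a mean `mu` and a log-variance `log_var` for a diagonal Gaussian over the final-layer outputs. The softmax likelihood, the zero-mean Gaussian prior with variance `prior_var`, the Monte Carlo estimator, and every function and parameter name here are illustrative assumptions; the collapsed variational inference variant mentioned in the abstract is not shown.

```python
import math
import random


def kl_diag_gaussians(mu, log_var, prior_var=1.0):
    """KL( N(mu, diag(exp(log_var))) || N(0, prior_var * I) ), in closed form."""
    kl = 0.0
    for m, lv in zip(mu, log_var):
        var = math.exp(lv)
        kl += 0.5 * (var / prior_var + m * m / prior_var - 1.0
                     + math.log(prior_var) - lv)
    return kl


def log_softmax(z):
    """Numerically stable log-softmax of a list of floats."""
    top = max(z)
    log_norm = top + math.log(sum(math.exp(v - top) for v in z))
    return [v - log_norm for v in z]


def output_space_neg_elbo(mu, log_var, label, n_samples=32,
                          prior_var=1.0, kl_weight=1.0, rng=None):
    """Monte Carlo estimate of a negative ELBO defined on the outputs.

    mu, log_var: per-class mean and log-variance (assumed network outputs).
    label: index of the observed class.
    All hyperparameters are hypothetical choices for illustration.
    """
    rng = rng or random.Random(0)
    expected_ll = 0.0
    for _ in range(n_samples):
        # Reparameterized sample of the final-layer output: z = mu + sigma * eps.
        z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
             for m, lv in zip(mu, log_var)]
        expected_ll += log_softmax(z)[label]
    expected_ll /= n_samples
    return -expected_ll + kl_weight * kl_diag_gaussians(mu, log_var, prior_var)


if __name__ == "__main__":
    loss = output_space_neg_elbo([2.0, -1.0, 0.5], [-1.0, -1.0, -1.0], label=0)
    print(f"negative ELBO estimate: {loss:.4f}")
```

Because the random variable lives in the output space, each sample has only as many dimensions as there are classes, so the sampling and KL costs do not grow with the number of network weights. In a real model, `mu` and `log_var` would come from two network heads and the loss would be minimized with an automatic differentiation framework.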