Improved uncertainty quantification for neural networks with Bayesian last layer

From arXiv: 10 pages, 4 figures, 1 table. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

Uncertainty quantification is an essential task in machine learning, and one in which neural networks (NNs) have traditionally not excelled. This can be a limitation for safety-critical applications, where uncertainty-aware methods like Gaussian processes or Bayesian linear regression are often preferred. Bayesian neural networks are an approach to address this limitation. They assume probability distributions for all parameters and yield distributed predictions. However, training and inference are typically intractable and approximations must be employed. A promising approximation is NNs with Bayesian last layer (BLL). They assume distributed weights only in the last linear layer and yield a normally distributed prediction. NNs with BLL can be seen as a Bayesian linear regression model with learned nonlinear features. To approximate the intractable Bayesian neural network, point estimates of the distributed weights in all but the last layer should be obtained by maximizing the marginal likelihood. This has previously been challenging, as the marginal likelihood is expensive to evaluate in this setting and prohibits direct training through backpropagation. We present a reformulation of the log-marginal likelihood of an NN with BLL which allows for efficient training using backpropagation. Furthermore, we address the challenge of quantifying uncertainty for extrapolation points. We provide a metric to quantify the degree of extrapolation and derive a method to improve the uncertainty quantification for these points. Our methods are derived for the multivariate case and demonstrated in a simulation study, where we compare Bayesian linear regression applied to a previously trained neural network with our proposed algorithm.
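
To make the "Bayesian linear regression with learned nonlinear features" view concrete, here is a minimal NumPy sketch for a single output. It uses the textbook evidence formula for Bayesian linear regression, not the reformulation proposed in the paper. The function names, the random `tanh` feature layer standing in for a trained network, and the values of the prior precision `alpha` and noise precision `beta` are all assumptions made for illustration.

```python
import numpy as np

def features(x, W, b):
    """Hypothetical stand-in for the learned features: one tanh layer plus a bias column."""
    h = np.tanh(x @ W + b)
    return np.hstack([h, np.ones((x.shape[0], 1))])

def bll_fit(Phi, t, alpha, beta):
    """Posterior over last-layer weights, given prior N(0, I/alpha) and noise precision beta."""
    M = Phi.shape[1]
    A = alpha * np.eye(M) + beta * Phi.T @ Phi  # posterior precision matrix
    m = beta * np.linalg.solve(A, Phi.T @ t)    # posterior mean
    return A, m

def log_marginal_likelihood(Phi, t, alpha, beta):
    """Standard log evidence of Bayesian linear regression on fixed features."""
    N, M = Phi.shape
    A, m = bll_fit(Phi, t, alpha, beta)
    resid = t - Phi @ m
    E = 0.5 * beta * resid @ resid + 0.5 * alpha * m @ m
    _, logdet_A = np.linalg.slogdet(A)
    return 0.5 * (M * np.log(alpha) + N * np.log(beta)
                  - logdet_A - N * np.log(2 * np.pi)) - E

def predict(Phi_new, A, m, beta):
    """Normally distributed prediction: mean and variance per query point."""
    mean = Phi_new @ m
    # phi^T A^{-1} phi for each row of Phi_new, plus the noise variance 1/beta
    var = 1.0 / beta + np.einsum("ij,ij->i", Phi_new, np.linalg.solve(A, Phi_new.T).T)
    return mean, var

# Toy data (assumed for illustration only)
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(50, 1))
t = np.sin(3.0 * x[:, 0]) + 0.1 * rng.standard_normal(50)
W, b = rng.standard_normal((1, 8)), rng.standard_normal(8)

alpha, beta = 1.0, 100.0
Phi = features(x, W, b)
A, m = bll_fit(Phi, t, alpha, beta)
lml = log_marginal_likelihood(Phi, t, alpha, beta)
mean, var = predict(features(np.array([[0.0], [3.0]]), W, b), A, m, beta)
```

In this sketch the feature weights `W` and `b` are fixed at random values. In the setting the abstract describes, they would instead be trained by maximizing the log-marginal likelihood, which is where an efficient, backpropagation-friendly form of that quantity matters.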

