Bayesian neural networks (BNNs) provide a formalism to quantify and calibrate uncertainty in deep learning. Current inference approaches for BNNs often rely on few-sample estimation for scalability, which can harm predictive performance, while the alternatives tend to be prohibitively expensive computationally. We tackle this challenge by revealing a previously unseen connection between inference on BNNs and volume computation problems. Building on this observation, we introduce a novel collapsed inference scheme that performs Bayesian model averaging using collapsed samples. A collapsed sample improves on a Monte Carlo sample by restricting sampling to a subset of the network weights and pairing that subset with a closed-form conditional distribution over the remaining weights. Each collapsed sample therefore represents uncountably many models drawn from the approximate posterior, which yields higher sample efficiency. Further, we show that, despite the non-linearity of neural networks, the marginalization of a collapsed sample can be solved analytically and efficiently by leveraging existing volume computation solvers. Our proposed use of collapsed samples achieves a balance between scalability and accuracy. On various regression and classification tasks, our collapsed Bayesian deep learning approach demonstrates significant improvements over existing methods and sets a new state of the art in terms of uncertainty estimation as well as predictive performance.
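To make the notion of a collapsed sample concrete, the following is a minimal sketch under strong simplifying assumptions, not the method of this paper. It uses a toy one-hidden-layer ReLU network with a scalar input and assumes an independent Gaussian approximate posterior over all weights. Because the output is linear in the last-layer weights, those weights can be integrated out in closed form by substituting their mean. This Gaussian shortcut stands in for the volume-computation step described above. All names (`mc_predict`, `collapsed_predict`, `w1_mean`, `w2_mean`, and so on) are hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def mc_predict(x, rng, n_samples, w1_mean, w1_std, w2_mean, w2_std):
    """Plain Monte Carlo: sample every weight, then average the outputs."""
    preds = []
    for _ in range(n_samples):
        w1 = rng.normal(w1_mean, w1_std)  # hidden-layer weights (sampled)
        w2 = rng.normal(w2_mean, w2_std)  # output weights (sampled)
        preds.append(w2 @ relu(w1 * x))
    return float(np.mean(preds))

def collapsed_predict(x, rng, n_samples, w1_mean, w1_std, w2_mean):
    """Collapsed estimate: sample w1 only and marginalize w2 analytically.

    Assumption: w2 is Gaussian and independent of w1, so
    E[w2 @ phi | w1] = w2_mean @ phi. This is a stand-in for the
    paper's volume-computation-based marginalization.
    """
    preds = []
    for _ in range(n_samples):
        w1 = rng.normal(w1_mean, w1_std)        # sampled subset of weights
        preds.append(w2_mean @ relu(w1 * x))    # closed-form average over w2
    return float(np.mean(preds))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w1_mean = np.array([1.0, -0.5, 0.8, 0.3])
    w2_mean = np.array([0.5, 1.0, -0.7, 0.2])
    # Repeat both estimators with only 5 samples each and compare their spread.
    mc = [mc_predict(1.5, rng, 5, w1_mean, 0.3, w2_mean, 1.0) for _ in range(300)]
    co = [collapsed_predict(1.5, rng, 5, w1_mean, 0.3, w2_mean) for _ in range(300)]
    print("variance of plain MC estimate: ", np.var(mc))
    print("variance of collapsed estimate:", np.var(co))
```

In this toy setting both estimators target the same predictive mean, but the collapsed one removes the variance contributed by the output weights, which illustrates why a single collapsed sample can be more informative than a single Monte Carlo sample.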