Bayesian neural networks (BNNs) provide a formalism to quantify and calibrate uncertainty in deep learning. Current inference approaches for BNNs often resort to few-sample estimation for scalability, which can harm predictive performance, while the alternatives tend to be prohibitively expensive to compute. We tackle this challenge by revealing a previously unseen connection between inference on BNNs and volume computation problems. Building on this observation, we introduce a novel collapsed inference scheme that performs Bayesian model averaging using collapsed samples. A collapsed sample improves on a Monte Carlo sample by restricting sampling to a subset of the network weights and pairing that sample with a closed-form conditional distribution over the remaining weights. It therefore represents uncountably many models drawn from the approximate posterior and yields higher sample efficiency. Further, we show that the marginalization of a collapsed sample can be solved analytically and efficiently, despite the non-linearity of neural networks, by leveraging existing volume computation solvers. Our proposed use of collapsed samples achieves a balance between scalability and accuracy. On various regression and classification tasks, our collapsed Bayesian deep learning approach demonstrates significant improvements over existing methods and sets a new state of the art in both uncertainty estimation and predictive performance.
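To make the idea of a collapsed sample concrete, the following is a minimal sketch under simplifying assumptions, not the scheme proposed here. It assumes a one-hidden-layer ReLU regression network with a factorized Gaussian approximate posterior, and it collapses only the output weights, where the conditional expectation is available in closed form because the output is linear in those weights. This swaps a simple linear-Gaussian identity in for the volume computation solvers mentioned above. The function names (`mc_predict`, `collapsed_predict`) and all parameter values are hypothetical.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def mc_predict(x, mu1, sd1, mu2, sd2, n_samples, rng):
    """Plain Monte Carlo model averaging: sample every weight."""
    preds = []
    for _ in range(n_samples):
        w1 = rng.normal(mu1, sd1)  # hidden weights, shape (d_in, d_hidden)
        w2 = rng.normal(mu2, sd2)  # output weights, shape (d_hidden,)
        preds.append(relu(x @ w1) @ w2)
    return float(np.mean(preds))


def collapsed_predict(x, mu1, sd1, mu2, n_samples, rng):
    """Collapsed averaging: sample w1 only, integrate w2 out analytically.

    ASSUMPTION: w2 is independent of w1 under the approximate posterior and
    the output is linear in w2, so E[relu(x @ w1) @ w2 | w1] equals
    relu(x @ w1) @ mu2. This is a stand-in for the volume-computation
    marginalization described in the text, not an implementation of it.
    """
    preds = []
    for _ in range(n_samples):
        w1 = rng.normal(mu1, sd1)        # sampled subset of the weights
        preds.append(relu(x @ w1) @ mu2)  # closed-form average over w2
    return float(np.mean(preds))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.array([1.0, -0.5, 2.0])                          # one input point
    mu1 = np.array([[0.5, -0.2], [0.1, 0.3], [0.4, 0.6]])   # posterior means
    mu2 = np.array([1.0, -1.0])
    print("plain MC :", mc_predict(x, mu1, 0.1, mu2, 1.0, 5, rng))
    print("collapsed:", collapsed_predict(x, mu1, 0.1, mu2, 5, rng))
```

Each collapsed draw stands for every setting of the output weights at once, so at the same sample budget its estimate varies less across repeated runs than the plain Monte Carlo estimate. This follows from the law of total variance and illustrates the sample-efficiency argument in this simplified setting.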