Uncertainty quantification and robustness to distribution shifts are important goals in machine learning and artificial intelligence. Although Bayesian Neural Networks (BNNs) allow the uncertainty in predictions to be assessed, they cannot distinguish between different sources of uncertainty. We present Credal Bayesian Deep Learning (CBDL). Heuristically, CBDL allows one to train an (uncountably) infinite ensemble of BNNs using only finitely many elements. This is possible thanks to prior and likelihood finitely generated credal sets (FGCSs), a concept from the imprecise probability literature. Intuitively, convex combinations of a finite collection of prior-likelihood pairs can represent infinitely many such pairs. After training, CBDL outputs a set of posteriors on the parameters of the neural network. At inference time, this posterior set is used to derive a set of predictive distributions, which is in turn used to distinguish between aleatoric and epistemic uncertainties and to quantify them. The predictive set also produces either (i) a collection of outputs enjoying desirable probabilistic guarantees, or (ii) the single output deemed the best, that is, the one having the highest predictive lower probability (another imprecise-probabilistic concept). CBDL is more robust than single BNNs to prior and likelihood misspecification, and to distribution shift. We show that CBDL is better at quantifying and disentangling different types of uncertainty than single BNNs, ensembles of BNNs, and Bayesian Model Averaging. In addition, we apply CBDL to two case studies to demonstrate its capabilities on downstream tasks: motion prediction in autonomous driving scenarios, and modeling blood glucose and insulin dynamics for artificial pancreas control. We show that CBDL performs better than an ensemble-of-BNNs baseline.
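To make the notion of a predictive lower probability concrete, the following is a minimal sketch and not the CBDL implementation. It assumes a discrete output space and assumes that the predictive set is the convex hull of finitely many extreme predictive distributions. Because the probability of a class is linear in the mixture weights, its minimum and maximum over the hull are attained at extreme elements. The function names and the numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def lower_upper_probabilities(
    extremes: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Per-class lower and upper probabilities over the convex hull of `extremes`."""
    n_classes = len(extremes[0])
    lower = [min(p[c] for p in extremes) for c in range(n_classes)]
    upper = [max(p[c] for p in extremes) for c in range(n_classes)]
    return lower, upper


def best_output(extremes: Sequence[Sequence[float]]) -> int:
    """Index of the class with the highest lower probability."""
    lower, _ = lower_upper_probabilities(extremes)
    return max(range(len(lower)), key=lambda c: lower[c])


def mixture(
    extremes: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Convex combination of the extreme distributions (weights sum to 1)."""
    n_classes = len(extremes[0])
    return [
        sum(w * p[c] for w, p in zip(weights, extremes))
        for c in range(n_classes)
    ]


# Hypothetical extreme predictive distributions over three classes.
extremes = [
    [0.70, 0.20, 0.10],
    [0.50, 0.40, 0.10],
    [0.60, 0.10, 0.30],
]

lower, upper = lower_upper_probabilities(extremes)
best = best_output(extremes)

# Any convex combination is also an element of the set, so three
# extreme elements stand in for infinitely many distributions.
mixed = mixture(extremes, [0.2, 0.3, 0.5])

print("lower:", lower)
print("upper:", upper)
print("best class:", best)
print("one mixture:", mixed)
```

In this toy example, the gap between the lower and upper probability of a class gives a rough sense of how much the elements of the set disagree about it. The uncertainty measures used by CBDL itself are defined in the paper and are not reproduced here.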