The rising interest in Bayesian deep learning (BDL) has led to a plethora of methods for estimating the posterior distribution. However, efficient computation of inferences, such as predictions, has been largely overlooked, with Monte Carlo integration remaining the standard. In this work, we examine streamlining prediction in BDL through a single forward pass without sampling. To this end, we use local linearisation of activation functions and local Gaussian approximations at linear layers, which allows us to analytically compute an approximation to the posterior predictive distribution. We showcase our approach on both MLPs and transformers, such as ViT and GPT-2, and assess its performance on regression and classification tasks.
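The core idea can be illustrated with a minimal sketch (an assumption-laden toy, not the authors' implementation): given a factorised Gaussian posterior over the weights of a small MLP, we propagate a mean and a diagonal variance through each layer analytically. Linear layers use exact moment matching for products of independent Gaussians, and the activation uses a first-order (delta-method) linearisation at the input mean; `tanh` and the single-hidden-layer architecture are illustrative choices.

```python
import numpy as np

def linear_gaussian(mu_in, var_in, W_mean, W_var, b_mean, b_var):
    """Moment-match a = W x + b when x ~ N(mu_in, diag(var_in)) and the
    weight posterior is an independent Gaussian per parameter."""
    mu_a = W_mean @ mu_in + b_mean
    # Var[W_ij x_j] for independent W_ij, x_j:
    #   M_ij^2 var_x_j + V_ij (mu_x_j^2 + var_x_j), summed over j,
    # plus the bias variance.
    var_a = (W_mean**2) @ var_in + W_var @ (mu_in**2 + var_in) + b_var
    return mu_a, var_a

def tanh_linearised(mu, var):
    """Local linearisation of tanh at the input mean: first-order
    approximation of the output moments."""
    mu_h = np.tanh(mu)
    J = 1.0 - mu_h**2  # tanh'(mu)
    return mu_h, (J**2) * var

# Toy single-hidden-layer MLP with a factorised Gaussian weight posterior.
rng = np.random.default_rng(0)
x = rng.normal(size=3)                                   # deterministic input
W1m, W1v = rng.normal(size=(5, 3)), 0.01 * np.ones((5, 3))
b1m, b1v = np.zeros(5), 0.01 * np.ones(5)
W2m, W2v = rng.normal(size=(1, 5)), 0.01 * np.ones((1, 5))
b2m, b2v = np.zeros(1), 0.01 * np.ones(1)

mu, var = linear_gaussian(x, np.zeros(3), W1m, W1v, b1m, b1v)
mu, var = tanh_linearised(mu, var)
mu, var = linear_gaussian(mu, var, W2m, W2v, b2m, b2v)
# mu, var approximate the posterior predictive mean and variance
# in a single deterministic forward pass -- no weight sampling.
```

With zero posterior variance the sketch collapses to an ordinary forward pass, which is a useful sanity check; the real method extends this idea to full covariances and to attention blocks in transformers.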