The rising interest in Bayesian deep learning (BDL) has led to a plethora of methods for estimating the posterior distribution. However, efficient computation of inferences, such as predictions, has been largely overlooked, with Monte Carlo integration remaining the standard. In this work we examine streamlining prediction in BDL through a single forward pass without sampling. To this end, we apply local linearisation to activation functions and local Gaussian approximations at linear layers, which allows us to compute an approximation to the posterior predictive distribution analytically. We showcase our approach on both MLPs and transformers, such as ViT and GPT-2, and assess its performance on regression and classification tasks.
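To make the propagation scheme concrete, below is a minimal NumPy sketch of one possible realisation: a Gaussian belief N(mu, Sigma) is pushed through a linear layer in closed form and through a tanh activation via local linearisation at the input mean, so the whole prediction is a single deterministic forward pass. The function names (`linear_gaussian`, `tanh_linearised`), the mean-field treatment of weight uncertainty via `W_var`, and the choice of tanh are illustrative assumptions, not the paper's exact implementation; in the paper's setting the weight variances would come from an approximate posterior (e.g. a Laplace or variational fit).

```python
import numpy as np

def linear_gaussian(mu, Sigma, W, b, W_var=None):
    # Closed-form Gaussian push-through for y = W x + b with x ~ N(mu, Sigma).
    mu_out = W @ mu + b
    Sigma_out = W @ Sigma @ W.T
    if W_var is not None:
        # Optional mean-field weight uncertainty (independent Gaussian entries),
        # an assumption of this sketch:
        # Var[y_i] += sum_j W_var[i, j] * (mu_j^2 + Sigma_jj)
        Sigma_out = Sigma_out + np.diag(W_var @ (mu**2 + np.diag(Sigma)))
    return mu_out, Sigma_out

def tanh_linearised(mu, Sigma):
    # Local linearisation of the activation at the input mean:
    # phi(x) ~ phi(mu) + J (x - mu), so the output stays Gaussian.
    J = np.diag(1.0 - np.tanh(mu)**2)  # Jacobian of elementwise tanh at mu
    return np.tanh(mu), J @ Sigma @ J.T

# Single sampling-free forward pass through a toy 2-layer MLP.
rng = np.random.default_rng(0)
mu, Sigma = rng.normal(size=3), 0.01 * np.eye(3)          # Gaussian input belief
mu, Sigma = linear_gaussian(mu, Sigma, rng.normal(size=(4, 3)), np.zeros(4))
mu, Sigma = tanh_linearised(mu, Sigma)
mu, Sigma = linear_gaussian(mu, Sigma, rng.normal(size=(2, 4)), np.zeros(2),
                            W_var=0.1 * np.ones((2, 4)))  # placeholder posterior variances
print(mu, np.diag(Sigma))  # approximate predictive mean and marginal variances
```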