Bayesian Neural Networks (BNNs) extend traditional neural networks to provide uncertainties associated with their outputs. On the forward pass through a BNN, predictions (and their uncertainties) are made either by Monte Carlo sampling network weights from the learned posterior or by analytically propagating statistical moments through the network. Though flexible, Monte Carlo sampling is computationally expensive and can be infeasible or impractical under resource constraints or for large networks. While moment propagation can ameliorate the computational costs of BNN inference, it can be difficult or impossible for networks with arbitrary nonlinearities, restricting the set of network layers that such a scheme permits. In this work, we demonstrate a simple yet effective approach for propagating statistical moments through arbitrary nonlinearities with only three deterministic samples, enabling few-sample variational inference of BNNs without restricting the set of network layers used. Furthermore, we leverage this approach to demonstrate a novel nonlinear activation function that we use to inject physics-informed prior information into the output nodes of a BNN.
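As a concrete illustration of the kind of three-sample moment propagation described above, the sketch below applies a one-dimensional unscented-transform-style rule: three deterministic sigma points, weighted so they match the input mean and variance, are pushed through an arbitrary scalar nonlinearity, and the output moments are read off from the transformed points. This is a minimal sketch under assumed conventions, not the paper's exact scheme; the function name `propagate_moments_1d` and the spread parameter `kappa` are illustrative choices.

```python
import numpy as np

def propagate_moments_1d(f, mu, var, kappa=2.0):
    """Propagate a (mean, variance) pair through an arbitrary scalar
    nonlinearity f using 3 deterministic sigma-point samples.

    A 1-D unscented-transform-style sketch; `kappa` controls the
    spread of the outer samples (assumed parameterization).
    """
    sigma = np.sqrt((1.0 + kappa) * var)
    # Three deterministic samples: the mean and one point on either side.
    points = np.array([mu, mu + sigma, mu - sigma])
    # Weights chosen so the samples reproduce the input mean and variance.
    w = np.array([kappa / (1.0 + kappa),
                  0.5 / (1.0 + kappa),
                  0.5 / (1.0 + kappa)])
    y = f(points)
    out_mu = np.dot(w, y)
    out_var = np.dot(w, (y - out_mu) ** 2)
    return out_mu, out_var

# Example: moments of a Gaussian input pushed through a ReLU.
mu, var = propagate_moments_1d(lambda x: np.maximum(x, 0.0), mu=0.5, var=1.0)
```

For an affine f this rule recovers the output moments exactly; for general nonlinearities it trades a small deterministic approximation error for the cost of only three function evaluations, in contrast to the many stochastic samples Monte Carlo inference would require.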