Neural networks (NNs) are primarily developed within the frequentist statistical framework. However, frequentist NNs cannot quantify the uncertainty in their predictions, so their robustness cannot be adequately assessed. In contrast, Bayesian neural networks (BNNs) naturally provide predictive uncertainty by applying Bayes' theorem, but their computational requirements pose significant challenges. Moreover, both frequentist NNs and BNNs overfit when trained on noisy and sparse data, which makes their predictions unreliable away from the available data. To address both problems simultaneously, we leverage insights from a hierarchical setting, in which the parameter priors are conditional on hyperparameters, and construct a BNN using a semi-analytical framework known as nonlinear sparse Bayesian learning (NSBL). We call the resulting network a sparse Bayesian neural network (SBNN); it aims to address the practical and computational issues associated with BNNs. In addition, imposing a sparsity-inducing prior encourages automatic pruning of redundant parameters, following the automatic relevance determination (ARD) concept. Redundant parameters are removed by optimally selecting the precision of the parameter prior probability density functions (pdfs), which yields a tractable treatment of overfitting. To demonstrate the benefits of the SBNN algorithm, we present an illustrative regression problem and compare the results of a BNN using standard Bayesian inference, a BNN using hierarchical Bayesian inference, and a BNN equipped with the proposed algorithm. We then demonstrate the importance of considering the full parameter posterior by comparing these results with those obtained using the Laplace approximation, with and without NSBL.
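The ARD idea of pruning parameters by optimizing their prior precisions can be illustrated on a much simpler model than a neural network. The sketch below is not NSBL or the SBNN algorithm. It is classical sparse Bayesian learning for a linear-Gaussian model, using MacKay's fixed-point update for the precisions, where the posterior is available in closed form. The function names (`invert`, `ard_linear`), the toy data, the known noise precision `beta`, and the cap `alpha_cap` are all assumptions made for illustration.

```python
def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * pv for v, pv in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ard_linear(phi, y, beta, n_iter=50, alpha_cap=1e12):
    """ARD for y = phi @ w + noise with prior w_i ~ N(0, 1 / alpha_i).

    beta is the (assumed known) noise precision. Returns the prior
    precisions alpha and the posterior mean from the last iteration.
    """
    n_feat = len(phi[0])
    # Gram matrix Phi^T Phi and projection Phi^T y.
    gram = [[sum(r[i] * r[j] for r in phi) for j in range(n_feat)] for i in range(n_feat)]
    proj = [sum(r[i] * t for r, t in zip(phi, y)) for i in range(n_feat)]
    alpha = [1.0] * n_feat
    mean = [0.0] * n_feat
    for _ in range(n_iter):
        # Gaussian posterior: cov = (diag(alpha) + beta * Phi^T Phi)^-1,
        # mean = beta * cov * Phi^T y.
        prec = [[beta * gram[i][j] + (alpha[i] if i == j else 0.0)
                 for j in range(n_feat)] for i in range(n_feat)]
        cov = invert(prec)
        mean = [beta * sum(cov[i][j] * proj[j] for j in range(n_feat))
                for i in range(n_feat)]
        # MacKay fixed-point update: alpha_i = gamma_i / mean_i^2, where
        # gamma_i = 1 - alpha_i * cov_ii measures how well the data
        # determine w_i. The cap stands in for alpha_i -> infinity (pruned).
        for i in range(n_feat):
            gamma = 1.0 - alpha[i] * cov[i][i]
            alpha[i] = min(gamma / max(mean[i] ** 2, 1e-300), alpha_cap)
    return alpha, mean


# Toy data (assumption): the target depends on x only; x^2 is a redundant feature.
xs = [(k - 10) / 10 for k in range(21)]
phi = [[x, x * x] for x in xs]
noise_std = 0.05
# Alternating +/- noise_std is a deterministic stand-in for measurement noise.
y = [2.0 * x + noise_std * (-1) ** k for k, x in enumerate(xs)]

alpha, mean = ard_linear(phi, y, beta=1.0 / noise_std ** 2)
print("prior precisions:", alpha)
print("posterior mean:  ", mean)
```

In this toy setting the precision of the redundant `x * x` weight grows until it reaches the cap, which pins that weight at zero, while the precision of the relevant weight stays small and its posterior mean stays close to 2. For a neural network the posterior is not Gaussian and these closed-form steps are unavailable, which is the gap a semi-analytical treatment such as NSBL is meant to fill.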