Uncertainty quantification in neural networks through methods such as Dropout, Bayesian neural networks, and Laplace approximations is either prone to underfitting or computationally demanding, rendering these approaches impractical for large-scale datasets. In this work, we address these shortcomings by shifting the focus from uncertainty in the weight space to uncertainty at the activation level, via Gaussian processes. More specifically, we introduce the Gaussian Process Activation function (GAPA) to capture neuron-level uncertainties. Our approach operates in a post-hoc manner, preserving the original mean predictions of the pre-trained neural network and thereby avoiding the underfitting issues commonly encountered in previous methods. We propose two variants. The first, GAPA-Free, estimates the kernel hyperparameters empirically from the training data and is highly efficient during training. The second, GAPA-Variational, learns the kernel hyperparameters via gradient descent, affording greater flexibility. Empirical results demonstrate that GAPA-Variational outperforms the Laplace approximation on most datasets in at least one of the uncertainty quantification metrics.
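To make the post-hoc, mean-preserving idea concrete, the following is a minimal illustrative sketch (not the authors' implementation) of a Gaussian process placed on a single neuron's pre-activations: the posterior mean is kept equal to the original activation of the pre-trained network, and the GP contributes only a predictive variance. All names (`GPActivation`, `rbf_kernel`, the lengthscale and noise values) are assumptions introduced for illustration.

```python
# Illustrative sketch of a GP-wrapped activation that preserves the
# pre-trained network's mean prediction and adds only a variance estimate.
# This is a simplified stand-in for GAPA, not the paper's actual code.
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel on scalar pre-activations."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

class GPActivation:
    """Post-hoc GP on one neuron's pre-activations (hypothetical helper).

    The mean output is exactly phi(z), the original activation, mirroring
    the mean-preservation property described in the abstract; the GP is
    used only to attach a predictive variance to each activation.
    """

    def __init__(self, z_train, phi, lengthscale=1.0, variance=1.0, noise=1e-3):
        self.z_train = np.asarray(z_train, dtype=float)  # pre-activations collected post hoc
        self.phi = phi                                   # original activation, e.g. np.tanh
        self.lengthscale, self.variance = lengthscale, variance
        K = rbf_kernel(self.z_train, self.z_train, lengthscale, variance)
        self.K_inv = np.linalg.inv(K + noise * np.eye(len(self.z_train)))

    def __call__(self, z):
        z = np.asarray(z, dtype=float)
        k_star = rbf_kernel(z, self.z_train, self.lengthscale, self.variance)
        mean = self.phi(z)                               # unchanged mean prediction
        # GP posterior variance: k(z, z) - k_* K^{-1} k_*^T (diagonal only)
        var = self.variance - np.sum((k_star @ self.K_inv) * k_star, axis=1)
        return mean, np.maximum(var, 0.0)

# Usage: wrap a neuron of an already-trained network without retraining it.
gpa = GPActivation(z_train=np.random.randn(200), phi=np.tanh, lengthscale=0.5)
mean, var = gpa(np.linspace(-3.0, 3.0, 5))
```

In this sketch, fixing the kernel hyperparameters from training statistics corresponds loosely to the GAPA-Free setting, while optimizing `lengthscale` and `variance` by gradient descent corresponds to the GAPA-Variational setting described above.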