Large-scale linear models are ubiquitous in machine learning and have recently found use as surrogate models for neural network uncertainty quantification, an approach known as the linearised Laplace method. However, the computational cost of Bayesian linear models restricts this method to small networks, small output spaces and small datasets. We address this limitation by introducing a scalable sample-based Bayesian inference method for conjugate Gaussian multi-output linear models, together with a matching method for hyperparameter (regularisation) selection. Furthermore, we use a classic feature normalisation method (the g-prior) to resolve a previously highlighted pathology of the linearised Laplace method. Together, these contributions allow us to perform linearised neural network inference with ResNet-18 on CIFAR100 (11M parameters, 100 output dimensions × 50k datapoints) and with a U-Net on a high-resolution tomographic reconstruction task (2M parameters, 251k output dimensions).
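To make "sample-based inference for a conjugate Gaussian linear model" concrete, the following is a minimal single-output sketch of one standard construction, often called sample-then-optimise: each posterior sample is the minimiser of a regularised least-squares problem whose targets and prior mean have been randomly perturbed. It is an illustration under stated assumptions, not necessarily the method introduced here. The names `posterior_samples`, `Phi`, `alpha` and `beta` are hypothetical, the prior and noise are assumed isotropic, and a dense solve stands in for the iterative optimiser that would be needed at scale.

```python
import numpy as np


def posterior_samples(Phi, y, alpha, beta, num_samples, rng):
    """Draw posterior samples for y = Phi @ theta + noise.

    Assumed model (illustrative): theta ~ N(0, I / alpha), noise ~ N(0, I / beta).
    Each sample minimises
        beta * ||y + eps - Phi @ theta||^2 + alpha * ||theta - theta0||^2
    with theta0 drawn from the prior and eps drawn from the noise model.
    """
    n, d = Phi.shape
    # Posterior precision; formed explicitly only because d is tiny here.
    H = beta * Phi.T @ Phi + alpha * np.eye(d)
    theta0 = rng.normal(scale=alpha ** -0.5, size=(d, num_samples))
    eps = rng.normal(scale=beta ** -0.5, size=(n, num_samples))
    # Normal equations of the perturbed problem, one column per sample.
    rhs = beta * Phi.T @ (y[:, None] + eps) + alpha * theta0
    return np.linalg.solve(H, rhs).T  # shape (num_samples, d)


# Small synthetic check against the closed-form posterior.
rng = np.random.default_rng(0)
n, d, alpha, beta = 50, 3, 1.0, 4.0
Phi = rng.normal(size=(n, d))
y = Phi @ rng.normal(size=d) + rng.normal(scale=beta ** -0.5, size=n)

samples = posterior_samples(Phi, y, alpha, beta, num_samples=20_000, rng=rng)

H = beta * Phi.T @ Phi + alpha * np.eye(d)
exact_mean = np.linalg.solve(H, beta * Phi.T @ y)
exact_cov = np.linalg.inv(H)
empirical_mean = samples.mean(axis=0)
empirical_cov = np.cov(samples.T)
```

On this toy problem the empirical mean and covariance of `samples` should agree with `exact_mean` and `exact_cov` up to Monte Carlo error. The point of the construction is that each sample only requires solving an optimisation problem, so the posterior covariance never has to be stored or factorised.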