Large-scale linear models are ubiquitous throughout machine learning, with contemporary application as surrogate models for neural network uncertainty quantification; that is, the linearised Laplace method. Alas, the computational cost associated with Bayesian linear models constrains this method's application to small networks, small output spaces and small datasets. We address this limitation by introducing a scalable sample-based Bayesian inference method for conjugate Gaussian multi-output linear models, together with a matching method for hyperparameter (regularisation) selection. Furthermore, we use a classic feature normalisation method (the g-prior) to resolve a previously highlighted pathology of the linearised Laplace method. Together, these contributions allow us to perform linearised neural network inference with ResNet-18 on CIFAR100 (11M parameters, 100 outputs × 50k datapoints), with ResNet-50 on ImageNet (50M parameters, 1000 outputs × 1.2M datapoints) and with a U-Net on a high-resolution tomographic reconstruction task (2M parameters, 251k output dimensions).