The linearized-Laplace approximation (LLA) has been shown to be effective and efficient for constructing Bayesian neural networks. It is theoretically compelling because it can be seen as a Gaussian process posterior whose mean function is the neural network's maximum-a-posteriori predictive function and whose covariance function is induced by the empirical neural tangent kernel. However, while its efficacy has been studied in large-scale tasks such as image classification, it has not been studied in sequential decision-making problems such as Bayesian optimization, where Gaussian processes (with simple mean functions and kernels such as the radial basis function) are the de facto surrogate models. In this work, we study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility. We also present some pitfalls that might arise, along with a potential problem with the LLA when the search space is unbounded.
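The Gaussian process view mentioned above can be written out as a short sketch. The notation here is an assumption introduced for illustration and is not taken from the abstract: $f(x;\theta)$ is the network, $\theta_{\mathrm{MAP}}$ its maximum-a-posteriori parameters, $J(x)$ the Jacobian of the network output with respect to the parameters at $\theta_{\mathrm{MAP}}$, and $\Sigma$ the Laplace posterior covariance (the inverse Hessian of the negative log-posterior at $\theta_{\mathrm{MAP}}$, or an approximation of it such as the generalized Gauss-Newton matrix).

```latex
% Assumed notation (not from the abstract): f, \theta_{MAP}, J, \Sigma.
\begin{align}
  % First-order Taylor expansion of the network around the MAP estimate
  f_{\mathrm{lin}}(x;\theta)
    &= f(x;\theta_{\mathrm{MAP}}) + J(x)\,(\theta - \theta_{\mathrm{MAP}}),
  \qquad
  J(x) = \left.\frac{\partial f(x;\theta)}{\partial \theta}\right|_{\theta=\theta_{\mathrm{MAP}}}, \\
  % Laplace approximation of the weight posterior
  \theta \mid \mathcal{D}
    &\sim \mathcal{N}\!\left(\theta_{\mathrm{MAP}},\, \Sigma\right), \\
  % A Gaussian over weights pushed through a map that is linear in the weights
  % gives a Gaussian process over functions
  f_{\mathrm{lin}} \mid \mathcal{D}
    &\sim \mathcal{GP}\!\left( f(x;\theta_{\mathrm{MAP}}),\; J(x)\,\Sigma\,J(x')^{\top} \right).
\end{align}
```

Because $f_{\mathrm{lin}}$ is linear in $\theta$, the Gaussian over weights maps to a Gaussian process over functions. The mean is the MAP network's prediction, and the covariance is built from the same Jacobian features $J(x)$ that define the empirical neural tangent kernel $J(x)\,J(x')^{\top}$.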