The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks. It is theoretically compelling since it can be seen as a Gaussian process posterior with the mean function given by the neural network's maximum-a-posteriori predictive function and the covariance function induced by the empirical neural tangent kernel. However, while its efficacy has been studied in large-scale tasks like image classification, it has not been studied in sequential decision-making problems like Bayesian optimization, where Gaussian processes with simple mean functions and kernels such as the radial basis function are the de facto surrogate models. In this work, we study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility. We also present some pitfalls that might arise and a potential problem with the LLA when the search space is unbounded.
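The Gaussian process view of the LLA can be written out as follows. This is a sketch of the standard formulation and is not taken from the text above; the symbols $f$, $\theta_{\mathrm{MAP}}$, $J$, $\Sigma$, and $\mathcal{D}$ are notation assumed here for illustration.

```latex
% Assumed notation: f(x; \theta) is the network, \theta_MAP its MAP estimate,
% J(x) the Jacobian of the network output w.r.t. the parameters at \theta_MAP,
% and \Sigma the Laplace posterior covariance (inverse of the Hessian, or an
% approximation of it, of the negative log-posterior at \theta_MAP).
\begin{align}
  f_{\mathrm{lin}}(x; \theta)
    &= f(x; \theta_{\mathrm{MAP}}) + J(x)\,(\theta - \theta_{\mathrm{MAP}}),
    \qquad
    J(x) = \left.\frac{\partial f(x; \theta)}{\partial \theta}\right|_{\theta = \theta_{\mathrm{MAP}}}, \\
  p(\theta \mid \mathcal{D})
    &\approx \mathcal{N}\!\left(\theta_{\mathrm{MAP}},\, \Sigma\right), \\
  f_{\mathrm{lin}} \mid \mathcal{D}
    &\sim \mathcal{GP}\!\left( f(x; \theta_{\mathrm{MAP}}),\; J(x)\, \Sigma\, J(x')^{\top} \right).
\end{align}
```

Under this notation the mean is the MAP prediction, and the covariance is assembled from the same Jacobians that define the empirical neural tangent kernel $k(x, x') = J(x)\, J(x')^{\top}$.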