Among generalized additive models, additive Mat\'ern Gaussian Processes (GPs) are among the most popular choices for scalable high-dimensional problems. Thanks to their additive structure and stochastic differential equation representation, back-fitting-based algorithms can reduce the time complexity of computing the posterior mean from $O(n^3)$ to $O(n \log n)$, where $n$ is the data size. However, generalizing these algorithms to efficiently compute the posterior variance and maximum log-likelihood remains an open problem. In this study, we demonstrate that for additive Mat\'ern GPs, not only the posterior mean but also the posterior variance, the log-likelihood, and the gradients of these three functions can be represented by formulas involving only sparse matrices and sparse vectors. We show how to use these sparse formulas to generalize back-fitting-based algorithms so that the posterior mean, posterior variance, log-likelihood, and their gradients can all be computed in $O(n \log n)$ time for additive GPs. We apply our algorithms to Bayesian optimization and propose efficient algorithms for posterior updates, hyperparameter learning, and computation of the acquisition function and its gradient. Given the posterior, our algorithms reduce the time complexity of computing the acquisition function and its gradient from $O(n^2)$ to $O(\log n)$ for a general learning rate, and even to $O(1)$ for a small learning rate.
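To make the two ingredients named above concrete, the following is a minimal illustrative sketch, not the algorithm proposed in this work. It assumes a Mat\'ern-1/2 (exponential) kernel with unit lengthscale in every dimension and uses dense NumPy linear algebra, so it runs in $O(n^3)$ rather than $O(n \log n)$. The names `matern12` and `backfit_mean`, the synthetic data, and all parameter values are hypothetical. The sketch checks (i) that back-fitting, written here as block Gauss-Seidel over the dimensions, recovers the same per-dimension posterior means as a direct solve, and (ii) that the inverse of a one-dimensional Mat\'ern-1/2 kernel matrix is tridiagonal once the inputs are sorted, which is the kind of sparsity the fast algorithms rely on.

```python
import numpy as np

def matern12(x, lengthscale=1.0):
    """Matern-1/2 (exponential) kernel matrix for 1-D inputs x."""
    return np.exp(-np.abs(x[:, None] - x[None, :]) / lengthscale)

def backfit_mean(kernels, y, noise_var, tol=1e-10, max_iter=5000):
    """Back-fitting (block Gauss-Seidel) for the per-dimension posterior means
    at the training inputs. Dense illustration only."""
    n = y.shape[0]
    means = [np.zeros(n) for _ in kernels]
    # One-dimensional smoothers (K_d + noise_var * I)^{-1} K_d.
    smoothers = [np.linalg.solve(K + noise_var * np.eye(n), K) for K in kernels]
    for _ in range(max_iter):
        change = 0.0
        for d, S in enumerate(smoothers):
            # Residual after removing the current estimates of all other components.
            residual = y - sum(m for j, m in enumerate(means) if j != d)
            new = S @ residual
            change = max(change, np.max(np.abs(new - means[d])))
            means[d] = new
        if change < tol:
            break
    return means

# Synthetic additive data (hypothetical): each column is a shuffled regular grid.
rng = np.random.default_rng(0)
n, dims, noise_var = 40, 3, 0.5
X = np.column_stack([rng.permutation(np.linspace(0.0, 5.0, n)) for _ in range(dims)])
y = np.sin(X[:, 0]) + np.cos(2 * X[:, 1]) + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=n)

kernels = [matern12(X[:, d]) for d in range(dims)]
means = backfit_mean(kernels, y, noise_var)

# Reference: direct dense solve of (sum_d K_d + noise_var * I) alpha = y.
alpha = np.linalg.solve(sum(kernels) + noise_var * np.eye(n), y)
direct = [K @ alpha for K in kernels]
max_err = max(np.max(np.abs(m - d)) for m, d in zip(means, direct))

# Sparsity check: with sorted inputs, the inverse kernel matrix is tridiagonal.
order = np.argsort(X[:, 0])
precision = np.linalg.inv(kernels[0][np.ix_(order, order)])
off_band = np.abs(np.triu(precision, k=2)).max()

print("max back-fitting error vs direct solve:", max_err)
print("largest entry outside the tridiagonal band:", off_band)
```

In this dense form each sweep costs $O(n^2)$ per dimension. The point of the sparse representation is that each one-dimensional smoothing step can instead be carried out with banded matrices after sorting the inputs of that dimension.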