Gaussian Process (GP) models provide a flexible framework for prediction and uncertainty quantification. For most covariance functions, however, exact GP prediction with $n$ data points scales as $\mathcal{O}(n^3)$, making it prohibitively expensive for large datasets or large numbers of prediction points. While nearest-neighbor-based prediction can work well in certain settings, non-pathological circumstances (for example, measurement noise) can severely restrict its efficiency. This work presents a complementary approach in which one conditions on carefully designed linear combinations of the data, which is particularly effective for predicting many values in large connected regions of the data domain. For kernel functions that are smooth away from the origin, conditioning on a small number $r$ of such data contrasts can reproduce the full exact conditional distributions to machine precision. These contrasts cost $\mathcal{O}(T r^2)$ work to compute, where $T$ is the cost of solving a linear system with the data covariance matrix, and so in many cases they can be obtained in linear or near-linear time by exploiting rank structure in well-behaved covariance matrices. At the cost of $\mathcal{O}(n r^2)$ additional precomputation, this approach can also provide predictions at arbitrary points in a designated region with $\mathcal{O}(1)$ online work, making it particularly attractive for problems where prediction points are not known in advance.
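To make the core idea concrete, below is a minimal NumPy sketch of conditioning a GP on $r$ linear combinations (contrasts) of the data rather than on all $n$ observations. The specific contrast construction used here, the dominant left singular vectors of the exact prediction-weight matrix $K^{-1} K_{X*}$, is an illustrative stand-in chosen for this sketch and is not necessarily the construction developed in the paper; the Matérn-3/2 kernel, the gap geometry, and all variable names are likewise assumptions made for the example.

```python
# Minimal sketch: GP prediction by conditioning on r data contrasts
# instead of all n observations. The contrast construction (a truncated
# SVD of K^{-1} K_{X*}) is an illustrative assumption for this sketch,
# not necessarily the construction used in the paper.
import numpy as np

def matern32(x, y, rho=0.5):
    """Matern-3/2 kernel, smooth away from the origin."""
    d = np.abs(x[:, None] - y[None, :]) / rho
    return (1.0 + np.sqrt(3.0) * d) * np.exp(-np.sqrt(3.0) * d)

rng = np.random.default_rng(0)
n, m, r = 500, 50, 10                   # data, prediction, and contrast counts
# Data locations leaving a gap, with the prediction region inside the gap
# (a connected region of the domain):
X = np.sort(np.concatenate([rng.uniform(0.0, 0.4, n // 2),
                            rng.uniform(0.6, 1.0, n // 2)]))
Xs = np.linspace(0.42, 0.58, m)

K = matern32(X, X) + 1e-8 * np.eye(n)   # data covariance (jittered)
Ks = matern32(X, Xs)                    # cross-covariance, n x m
Kss = matern32(Xs, Xs)                  # prediction covariance, m x m
y = np.linalg.cholesky(K) @ rng.standard_normal(n)  # a sample path

# Exact conditional distribution, O(n^3):
W = np.linalg.solve(K, Ks)              # prediction weights K^{-1} K_{X*}
mu_exact = W.T @ y
cov_exact = Kss - Ks.T @ W

# Contrasts: r linear combinations z = Phi y of the data. Taking Phi from
# the dominant left singular vectors of W means conditioning on z captures
# (numerically) all the information the prediction region needs.
U, s, _ = np.linalg.svd(W, full_matrices=False)
Phi = U[:, :r].T                        # r x n contrast weights

z = Phi @ y                             # the r observed contrasts
C = Phi @ K @ Phi.T                     # Cov(z, z), r x r
G = Phi @ Ks                            # Cov(z, y*), r x m
mu_r = G.T @ np.linalg.solve(C, z)      # conditional mean given z
cov_r = Kss - G.T @ np.linalg.solve(C, G)

# Both errors should be near machine precision (up to the jitter):
print("mean error:", np.max(np.abs(mu_r - mu_exact)))
print("cov  error:", np.max(np.abs(cov_r - cov_exact)))
```

In this 1-D Matérn example the exact weight matrix $W$ has very low numerical rank for prediction inside the gap, so a handful of contrasts suffices; the point of the sketch is only the conditioning algebra, while the paper's contribution concerns designing such contrasts efficiently and with accuracy guarantees.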