Gaussian processes (GPs) are typically criticised for their unfavourable scaling in both computational and memory requirements. For large datasets, sparse GPs reduce these demands by conditioning on a small set of inducing variables designed to summarise the data. In practice, however, large datasets that require many inducing variables, such as spatial data with short lengthscales, can make even sparse GPs computationally expensive, because cost limits the number of inducing variables one can use. In this work, we propose a new class of inter-domain variational GPs, constructed by projecting a GP onto a set of compactly supported B-spline basis functions. The key benefit of our approach is that the compact support of the B-spline basis functions admits the use of sparse linear algebra, which significantly speeds up matrix operations and drastically reduces the memory footprint. This allows us to efficiently model fast-varying spatial phenomena with tens of thousands of inducing variables, where previous approaches failed.
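To illustrate why compact support leads to sparse matrices, the following is a minimal sketch rather than the implementation used in this work. It evaluates a one-dimensional B-spline basis with the standard Cox-de Boor recursion and checks that each input activates at most `degree + 1` basis functions, so that the Gram matrix of basis evaluations is banded. The function name `bspline_basis`, the uniform knot vector, and the choice of cubic splines are assumptions made for the example only; the inter-domain covariance matrices of the proposed model are not constructed here.

```python
import numpy as np


def bspline_basis(x, knots, degree):
    """Evaluate all B-spline basis functions at x (Cox-de Boor recursion).

    Hypothetical helper for illustration. Returns a dense array of shape
    (len(x), len(knots) - degree - 1).
    """
    x = np.asarray(x, dtype=float)
    t = np.asarray(knots, dtype=float)
    # Degree 0: indicator functions of the half-open intervals [t_i, t_{i+1}).
    B = ((x[:, None] >= t[None, :-1]) & (x[:, None] < t[None, 1:])).astype(float)
    for d in range(1, degree + 1):
        n_funcs = len(t) - d - 1
        B_next = np.zeros((len(x), n_funcs))
        for i in range(n_funcs):
            left_den = t[i + d] - t[i]
            right_den = t[i + d + 1] - t[i + 1]
            # Terms with a zero denominator (repeated knots) are skipped.
            if left_den > 0:
                B_next[:, i] += (x - t[i]) / left_den * B[:, i]
            if right_den > 0:
                B_next[:, i] += (t[i + d + 1] - x) / right_den * B[:, i + 1]
        B = B_next
    return B


# Assumed setup: cubic splines on 24 uniform knots, giving 20 basis functions.
degree = 3
knots = np.linspace(0.0, 1.0, 24)
# Inputs restricted to the interval where the basis forms a partition of unity.
x = np.linspace(knots[degree], knots[-degree - 1], 200, endpoint=False)

Phi = bspline_basis(x, knots, degree)  # shape (200, 20)
gram = Phi.T @ Phi                     # shape (20, 20), banded

# Each row of Phi has at most degree + 1 non-zero entries.
max_active = int((Phi != 0).sum(axis=1).max())

# Entries of the Gram matrix further than `degree` from the diagonal vanish,
# because the corresponding basis functions have disjoint supports.
idx = np.arange(gram.shape[0])
outside_band = np.abs(idx[:, None] - idx[None, :]) > degree
```

Here the matrices are stored densely for simplicity. With tens of thousands of basis functions, one would instead store `Phi` and `gram` in a sparse or banded format, which is where the savings in memory and computation would come from.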