Gaussian processes have become an indispensable part of the spatial statistician's toolbox, but they are unsuitable for analyzing large datasets because of the significant time and memory needed to fit the associated model exactly. The Vecchia approximation is widely used to reduce this computational complexity and can be calculated with embarrassingly parallel algorithms. While multi-core software has been developed for the Vecchia approximation, such as the GpGp R package, software designed to run on graphics processing units (GPUs) is lacking, despite the tremendous success GPUs have had in statistics and machine learning. We compare three ways to implement the Vecchia approximation on a GPU: two are similar to methods used for other Gaussian process approximations, and one is new. We investigate the impact of memory type on performance and optimize the new method accordingly. We show that our new method outperforms the other two and make it available in the GpGpU R package. We compare GpGpU to existing multi-core and GPU-accelerated software by fitting Gaussian process models to various datasets, including a large spatio-temporal dataset of $n>10^6$ points collected from an earth-observing satellite. Our results show that GpGpU achieves faster runtimes and better predictive accuracy than these alternatives.
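To make the "embarrassingly parallel" structure of the Vecchia approximation concrete, the following is a minimal CPU sketch in plain Python; it is not the GpGpU implementation. The approximation replaces the joint density with $\prod_i p(y_i \mid y_{c(i)})$, where $c(i)$ holds at most $m$ previously ordered neighbors of point $i$. The exponential covariance, its parameters, the nugget, the given point ordering, and all function names below are illustrative assumptions.

```python
import math

def exp_cov(a, b, variance=1.0, length_scale=0.5):
    # Assumed exponential covariance; any valid kernel could be substituted.
    return variance * math.exp(-math.dist(a, b) / length_scale)

def cov_matrix(points, nugget=1e-8):
    # Dense covariance matrix with a small nugget on the diagonal for stability.
    n = len(points)
    return [[exp_cov(points[r], points[c]) + (nugget if r == c else 0.0)
             for c in range(n)] for r in range(n)]

def sequential_log_densities(K, v):
    # Returns [log p(v_r | v_0, ..., v_{r-1})] for a zero-mean Gaussian with
    # covariance K, using a Cholesky factor L and a forward solve L z = v.
    n = len(v)
    L = [[0.0] * n for _ in range(n)]
    z = [0.0] * n
    terms = []
    for r in range(n):
        for c in range(r + 1):
            s = K[r][c] - sum(L[r][k] * L[c][k] for k in range(c))
            L[r][c] = math.sqrt(s) if r == c else s / L[c][c]
        z[r] = (v[r] - sum(L[r][k] * z[k] for k in range(r))) / L[r][r]
        terms.append(-0.5 * math.log(2 * math.pi)
                     - math.log(L[r][r]) - 0.5 * z[r] ** 2)
    return terms

def vecchia_term(i, locs, y, m):
    # log p(y_i | y_c(i)), where c(i) is the (at most) m nearest earlier points.
    neighbors = sorted(range(i), key=lambda j: math.dist(locs[i], locs[j]))[:m]
    idx = neighbors + [i]  # point i goes last, so its term is the final one
    K = cov_matrix([locs[j] for j in idx])
    return sequential_log_densities(K, [y[j] for j in idx])[-1]

def vecchia_loglik(locs, y, m):
    # Each term depends only on its own small neighbor set, so the terms
    # could be evaluated independently (e.g. one per thread).
    return sum(vecchia_term(i, locs, y, m) for i in range(len(y)))

def exact_loglik(locs, y):
    # Exact Gaussian log-likelihood, O(n^3); shown only for comparison.
    return sum(sequential_log_densities(cov_matrix(locs), y))

# Small made-up dataset for illustration only.
locs = [(0.0, 0.0), (0.3, 0.1), (0.7, 0.2), (0.2, 0.6), (0.9, 0.8), (0.5, 0.5)]
y = [0.4, -0.1, 0.8, 0.3, -0.6, 0.2]

print("Vecchia, m=2:", vecchia_loglik(locs, y, m=2))
print("Exact:       ", exact_loglik(locs, y))
```

Each call to `vecchia_term` factorizes a matrix of size at most $(m+1) \times (m+1)$ rather than $n \times n$, which is where the cost reduction comes from. When $m \ge n-1$, every point conditions on all earlier points and the sum matches the exact log-likelihood up to floating-point error.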