Spatial statistics often relies on Gaussian processes (GPs) to capture dependence across locations. However, their computational cost grows rapidly with the number of locations, and fitting can take several hours even for moderate sample sizes. To address this, we propose Semi-Implicit Variational Inference (SIVI), a highly flexible Bayesian approximation method, for scalable Bayesian spatial interpolation. We evaluated SIVI with a GP prior and with a Nearest-Neighbour Gaussian Process (NNGP) prior against Automatic Differentiation Variational Inference (ADVI), Pathfinder, and Hamiltonian Monte Carlo (HMC), the reference method in spatial statistics. Methods were compared on predictive performance, measured by the continuous ranked probability score (CRPS), the interval score, and the negative log-predictive density, across 50 replicates for both Gaussian and Poisson outcomes. SIVI-based methods achieved results similar to those of HMC while being far faster. On average, for the Poisson scenario with 500 training locations, SIVI reduced the computation time from roughly 6 hours for HMC to 130 seconds. Furthermore, SIVI-NNGP analyzed a simulated land surface temperature dataset of 150,000 locations, estimating all unknown model parameters, in under two minutes. These results highlight the potential of SIVI as a flexible and scalable inference technique in spatial statistics.
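As a rough illustration of two of the predictive metrics named above, the following sketch computes a sample-based CRPS and the interval score for a single held-out observation. It assumes the standard textbook definitions (the energy form of the CRPS and the interval score for a central (1 - alpha) interval); the function names `crps_from_samples` and `interval_score` are hypothetical and are not taken from the paper, which may compute these scores differently.

```python
def crps_from_samples(samples, y):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|.

    `samples` are draws from the predictive distribution at one location,
    `y` is the observed value. Lower is better.
    """
    xs = [float(s) for s in samples]
    n = len(xs)
    # Mean absolute error between the draws and the observation.
    term1 = sum(abs(x - y) for x in xs) / n
    # Mean absolute difference between all pairs of draws (O(n^2)).
    term2 = sum(abs(a - b) for a in xs for b in xs) / (n * n)
    return term1 - 0.5 * term2


def interval_score(lower, upper, y, alpha=0.05):
    """Interval score for a central (1 - alpha) prediction interval.

    Interval width plus a penalty of 2/alpha times the distance by which
    the observation falls outside the interval. Lower is better.
    """
    width = upper - lower
    below = (2.0 / alpha) * max(lower - y, 0.0)
    above = (2.0 / alpha) * max(y - upper, 0.0)
    return width + below + above


if __name__ == "__main__":
    draws = [0.8, 1.1, 0.9, 1.3, 1.0]  # made-up predictive draws
    print(crps_from_samples(draws, y=1.2))
    print(interval_score(lower=0.7, upper=1.4, y=1.2, alpha=0.05))
```

In a comparison like the one described, such scores would typically be averaged over all test locations and then over replicates; the pairwise term makes this CRPS estimate quadratic in the number of draws, so a sorted-sample formula may be preferable for large posterior samples.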