This paper focuses on statistical modelling with additive Gaussian process (GP) models and their efficient implementation for large-scale spatio-temporal data with a multi-dimensional grid structure. To achieve this, we exploit the Kronecker product structure of the covariance kernel. Although this method has gained popularity in the GP literature, the existing approach is limited to covariance kernels with a tensor product structure and does not allow flexible modelling and selection of interaction effects, which are considered an important component of spatio-temporal analysis. We extend the method to a more general class of additive GP models that accounts for main effects and selected interaction effects. Our approach allows for easy identification and interpretation of interaction effects. The proposed model is applied to the analysis of NO$_2$ concentrations during the COVID-19 lockdown in London. Our scalable method enables the analysis of large-scale, hourly-recorded data collected from 59 stations across the city, adding insight to findings from previous research that used daily or weekly averaged data.
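To illustrate the kind of Kronecker structure the abstract refers to, the following is a minimal sketch rather than the paper's implementation. It assumes a two-dimensional grid, a squared-exponential kernel on each dimension, and an additive covariance made of two main-effect terms and one interaction term; the function names `rbf_kernel`, `kron_matvec` and `additive_matvec` are hypothetical. The point it shows is that a matrix-vector product with the full covariance can be computed from the small per-dimension matrices without ever forming the large Kronecker product.

```python
import numpy as np


def rbf_kernel(x, lengthscale=1.0):
    """Squared-exponential kernel matrix on a 1-D grid (an assumed kernel choice)."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)


def kron_matvec(A, B, v):
    """Compute (A kron B) @ v without forming the Kronecker product.

    Uses the row-major identity (A kron B) vec(V) = vec(A V B^T).
    """
    n1, n2 = A.shape[0], B.shape[0]
    V = v.reshape(n1, n2)
    return (A @ V @ B.T).ravel()


def additive_matvec(K1, K2, v, include_interaction=True):
    """Matvec with an assumed additive covariance on a 2-D grid:

        K = K1 kron J2  +  J1 kron K2  [+ K1 kron K2]

    where J1, J2 are all-ones matrices, so the first two terms are
    main effects and the optional third term is the interaction.
    """
    J1 = np.ones_like(K1)
    J2 = np.ones_like(K2)
    out = kron_matvec(K1, J2, v) + kron_matvec(J1, K2, v)
    if include_interaction:
        out = out + kron_matvec(K1, K2, v)
    return out


if __name__ == "__main__":
    # Hypothetical grid sizes, e.g. 5 locations by 4 time points.
    x1 = np.linspace(0.0, 1.0, 5)
    x2 = np.linspace(0.0, 1.0, 4)
    K1, K2 = rbf_kernel(x1), rbf_kernel(x2)
    v = np.random.default_rng(0).standard_normal(x1.size * x2.size)
    print(additive_matvec(K1, K2, v).shape)
```

For per-dimension grids of sizes $n_1$ and $n_2$, each structured product costs $O(n_1 n_2 (n_1 + n_2))$ operations instead of the $O(n_1^2 n_2^2)$ needed with the dense matrix. Dropping or keeping the interaction term corresponds to switching the `include_interaction` flag in this sketch.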