Analysis of geospatial data has traditionally been model-based, with a mean model, customarily specified as a linear regression on the covariates, and a covariance model encoding the spatial dependence. We relax the strong assumption of linearity and propose embedding neural networks directly within traditional geostatistical models to accommodate non-linear mean functions while retaining all other advantages: using Gaussian processes (GPs) to explicitly model the spatial covariance, enabling inference on the covariate effect through the mean and on the spatial dependence through the covariance, and offering predictions at new locations via kriging. We propose NN-GLS, a new neural network estimation algorithm for the non-linear mean in GP models that explicitly accounts for the spatial covariance through generalized least squares (GLS), the same loss used in the linear case. We show that NN-GLS admits a representation as a special type of graph neural network (GNN). This connection facilitates the use of standard neural network computational techniques for irregular geospatial data, enabling novel and scalable mini-batching, backpropagation, and kriging schemes. Theoretically, we show that NN-GLS is consistent for irregularly observed, spatially correlated data processes. We also provide a finite-sample concentration rate, which quantifies the need to accurately model the spatial covariance in neural networks for dependent data. To our knowledge, these are the first large-sample results for any neural network algorithm for irregular spatial data. We demonstrate the methodology on simulated and real datasets.
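To make the GLS loss mentioned above concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes an exponential covariance with a nugget, a 1/n normalization of the loss, and a fixed function `mean_fn` standing in for a trained neural network; the names `exponential_covariance`, `gls_loss`, and `mean_fn` and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def exponential_covariance(coords, sigma2=1.0, phi=1.0, tau2=0.1):
    """Assumed covariance model: sigma2 * exp(-phi * distance) + tau2 * I."""
    diffs = coords[:, None, :] - coords[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    return sigma2 * np.exp(-phi * dists) + tau2 * np.eye(len(coords))


def gls_loss(y, fitted, cov):
    """GLS loss (1/n) * r^T cov^{-1} r, where r = y - fitted."""
    resid = y - fitted
    chol = np.linalg.cholesky(cov)            # cov = L L^T, L lower triangular
    whitened = np.linalg.solve(chol, resid)   # L^{-1} r decorrelates residuals
    return float(whitened @ whitened) / len(y)


def mean_fn(x):
    """Hypothetical non-linear mean; a neural network would take this role."""
    return np.sin(2.0 * x)


# Simulate spatially correlated data at irregular locations.
rng = np.random.default_rng(0)
n = 50
coords = rng.uniform(size=(n, 2))             # irregular 2-D locations
x = rng.uniform(-1.0, 1.0, size=n)            # a single covariate
cov = exponential_covariance(coords)
y = mean_fn(x) + rng.multivariate_normal(np.zeros(n), cov)

loss_gls = gls_loss(y, mean_fn(x), cov)        # accounts for spatial covariance
loss_ols = gls_loss(y, mean_fn(x), np.eye(n))  # identity covariance: plain MSE
```

With an identity covariance the same function reduces to the ordinary mean squared error, which illustrates how the GLS loss generalizes the usual neural network objective to spatially dependent observations.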