Analysis of geospatial data has traditionally been model-based, pairing a mean model, customarily specified as a linear regression on the covariates, with a covariance model that encodes the spatial dependence. We relax the strong assumption of linearity and propose embedding neural networks directly within traditional geostatistical models to accommodate non-linear mean functions while retaining the other advantages of these models: Gaussian processes (GPs) explicitly model the spatial covariance, which enables inference on the covariate effect through the mean and on the spatial dependence through the covariance, and yields predictions at new locations via kriging. We propose NN-GLS, a new neural network estimation algorithm for the non-linear mean in GP models that explicitly accounts for the spatial covariance through generalized least squares (GLS), the same loss used in the linear case. We show that NN-GLS admits a representation as a special type of graph neural network (GNN). This connection facilitates the use of standard neural network computational techniques for irregular geospatial data, enabling novel and scalable mini-batching, backpropagation, and kriging schemes. On the theoretical side, we show that NN-GLS is consistent for irregularly observed, spatially correlated data processes. To our knowledge, this is the first asymptotic consistency result for any neural network algorithm for spatial data. We demonstrate the methodology on simulated and real datasets.
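To make the GLS loss concrete, here is a minimal NumPy sketch of evaluating it for a non-linear mean function. It is not the NN-GLS implementation: the exponential covariance and its parameters (`sigma2`, `phi`, `nugget`), the one-hidden-layer tanh network, and all function names are assumptions chosen for illustration. The sketch only evaluates the loss with a dense Cholesky factor; it does not include the mini-batching, backpropagation, or kriging schemes the abstract mentions.

```python
import numpy as np


def exponential_covariance(coords, sigma2=1.0, phi=3.0, nugget=0.1):
    """Assumed covariance model: sigma2 * exp(-phi * distance) + nugget * I."""
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return sigma2 * np.exp(-phi * dists) + nugget * np.eye(len(coords))


def tiny_mlp(X, W1, b1, w2, b2):
    """Hypothetical one-hidden-layer network standing in for the non-linear mean."""
    return np.tanh(X @ W1 + b1) @ w2 + b2


def gls_loss(y, mean_pred, cov):
    """GLS loss (y - m)^T cov^{-1} (y - m) / n, computed via a Cholesky factor."""
    chol = np.linalg.cholesky(cov)                 # cov = chol @ chol.T
    whitened = np.linalg.solve(chol, y - mean_pred)  # decorrelated residuals
    return float(whitened @ whitened) / len(y)


# Small synthetic example at irregular (random) locations.
rng = np.random.default_rng(0)
n, p, hidden = 50, 2, 8
coords = rng.uniform(size=(n, 2))       # spatial locations in the unit square
X = rng.normal(size=(n, p))             # covariates
cov = exponential_covariance(coords)

# Response = non-linear mean + spatially correlated noise.
true_mean = np.sin(X[:, 0]) + X[:, 1] ** 2
y = true_mean + np.linalg.cholesky(cov) @ rng.normal(size=n)

# Randomly initialised (untrained) network weights.
W1 = rng.normal(scale=0.5, size=(p, hidden))
b1 = np.zeros(hidden)
w2 = rng.normal(scale=0.5, size=hidden)
b2 = 0.0

loss_gls = gls_loss(y, tiny_mlp(X, W1, b1, w2, b2), cov)
loss_ols = gls_loss(y, tiny_mlp(X, W1, b1, w2, b2), np.eye(n))  # plain MSE
print(f"GLS loss: {loss_gls:.3f}, unweighted MSE: {loss_ols:.3f}")
```

With an identity covariance the same function reduces to the ordinary mean squared error, which is the sense in which GLS generalizes the usual neural network loss to spatially correlated observations.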