Analysis of geospatial data has traditionally been model-based, with a mean model, customarily specified as a linear regression on the covariates, and a covariance model encoding the spatial dependence. We relax the strong assumption of linearity and propose embedding neural networks directly within traditional geostatistical models to accommodate non-linear mean functions. This retains all other advantages of these models: Gaussian processes (GPs) explicitly model the spatial covariance, inference is available on the covariate effect through the mean and on the spatial dependence through the covariance, and predictions at new locations are obtained via kriging. We propose NN-GLS, a new neural network estimation algorithm for the non-linear mean in GP models that explicitly accounts for the spatial covariance through generalized least squares (GLS), the same loss used in the linear case. We show that NN-GLS admits a representation as a special type of graph neural network (GNN). This connection facilitates the use of standard neural network computational techniques for irregular geospatial data, enabling novel and scalable mini-batching, backpropagation, and kriging schemes. Theoretically, we show that NN-GLS is consistent for irregularly observed, spatially correlated data processes. To our knowledge, this is the first asymptotic consistency result for any neural network algorithm for spatial data. We demonstrate the methodology on simulated and real datasets.
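To make the role of the GLS loss concrete, the following is a minimal sketch of how a spatially weighted loss differs from ordinary least squares when evaluated on a non-linear mean. It is not the NN-GLS implementation: the exponential covariance with a nugget, its parameter values, the `1/n` normalization, and the names `exponential_cov`, `gls_loss`, and `ols_loss` are all assumptions chosen for illustration, and the mean is a fixed function rather than a trained neural network.

```python
import numpy as np


def exponential_cov(coords, sigma2=1.0, phi=3.0, nugget=0.1):
    """Assumed covariance: sigma2 * exp(-phi * distance) + nugget * I."""
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return sigma2 * np.exp(-phi * dists) + nugget * np.eye(len(coords))


def gls_loss(y, mean_pred, cov):
    """GLS loss (1/n) * r^T cov^{-1} r, where r = y - mean_pred."""
    chol = np.linalg.cholesky(cov)                   # cov = L L^T
    whitened = np.linalg.solve(chol, y - mean_pred)  # L^{-1} r, decorrelated residuals
    return float(whitened @ whitened) / len(y)


def ols_loss(y, mean_pred):
    """Ordinary least squares loss, which ignores spatial correlation."""
    resid = y - mean_pred
    return float(resid @ resid) / len(y)


# Simulate spatially correlated responses around a non-linear mean.
rng = np.random.default_rng(0)
n = 50
coords = rng.uniform(size=(n, 2))        # irregular spatial locations
x = rng.uniform(size=n)                  # a single covariate
cov = exponential_cov(coords)
true_mean = np.sin(2 * np.pi * x)        # hypothetical non-linear mean function
y = true_mean + np.linalg.cholesky(cov) @ rng.standard_normal(n)

print("GLS loss at the true mean:", gls_loss(y, true_mean, cov))
print("OLS loss at the true mean:", ols_loss(y, true_mean))
```

With an identity covariance the two losses coincide, so in this sketch the GLS loss reduces to OLS exactly when the spatial dependence is ignored. Replacing `mean_pred` with the output of a neural network and minimizing `gls_loss` over its weights gives the general shape of the estimation problem described above, under the stated assumptions.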