Accurate spatial prediction and rigorous uncertainty quantification are central to modern spatial epidemiology and environmental risk analysis. We introduce a statistically principled hybrid modelling framework that integrates the nonlinear, attention-based representation learning of a dynamic Graph Attention Network (GATv2) with a latent Gaussian spatial process from model-based geostatistics (MBG). The framework jointly captures relational dependence encoded in graph structures and continuous spatial dependence governed by physical proximity. We evaluate the proposed model in a controlled simulation study and an applied analysis of malaria prevalence data, comparing its predictive accuracy, calibration, and uncertainty quantification with those of classical geostatistical models and standalone GATv2 architectures. Our analyses show that GATv2 captures complex nonlinear interactions but fails to account for residual spatial autocorrelation, resulting in miscalibrated predictive distributions. Conversely, geostatistical models provide coherent uncertainty quantification through structured covariance functions, yet they are constrained by linear predictor assumptions and by their reliance on Euclidean distance to encode spatial structure. By combining attention mechanisms and nonlinear features with an explicit probabilistic spatial random field, the hybrid model captures relational dependence, consistently improves predictive accuracy, and provides more realistic uncertainty quantification in both the simulation and applied settings. Overall, the findings indicate that the hybrid model is a statistically coherent and empirically robust framework for modelling complex spatial and spatio-temporal processes in settings where both distance-based and structure-based dependencies operate.
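To make the structure of such a hybrid concrete, the display below sketches one way a GATv2 component and a latent Gaussian field could be combined for prevalence data. It is an illustrative formulation only, not necessarily the specification used in this work: the additive form of the linear predictor, the binomial likelihood, and the symbols $f_\theta$, $z_i$, $G$, $n_i$, $\sigma^2$, $\phi$, and $\rho$ are all assumptions introduced here. The attention score in the last line is the standard GATv2 form.

```latex
% Illustrative sketch under stated assumptions; notation is hypothetical.
% Y_i: positive cases among n_i individuals tested at location x_i
% f_theta: GATv2 output for node i given node features z_i and graph G
% S: zero-mean Gaussian process with variance sigma^2 and correlation rho
\begin{align*}
Y_i \mid p(x_i) &\sim \mathrm{Binomial}\bigl(n_i,\, p(x_i)\bigr), \\
\operatorname{logit} p(x_i) &= f_\theta(z_i; G) + S(x_i), \\
\operatorname{Cov}\bigl(S(x), S(x')\bigr) &= \sigma^2 \rho\bigl(\lVert x - x' \rVert; \phi\bigr), \\
e_{ij} &= a^\top \mathrm{LeakyReLU}\bigl(W [h_i \,\Vert\, h_j]\bigr), \qquad
\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k \in \mathcal{N}(i)} \exp(e_{ik})}.
\end{align*}
```

Under this reading, $f_\theta$ carries the nonlinear, graph-based (relational) dependence, while $S$ absorbs the residual distance-based spatial correlation that a standalone GATv2 would leave unmodelled. Setting $f_\theta$ to a linear function of covariates would recover a classical geostatistical model.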