Spatial confounding is a fundamental issue in spatial regression models. It arises because spatial random effects, included to approximate unmeasured spatial variation, are typically not independent of the covariates in the model, and it can lead to substantial bias in covariate effect estimates. The problem is complex and has been the topic of extensive research, with sometimes puzzling and seemingly contradictory results. Here, we develop a broad theoretical framework that brings mathematical clarity to the mechanisms of spatial confounding, providing explicit analytical expressions for the resulting bias. We see that the problem is directly linked to spatial smoothing, and we identify exactly how the size and occurrence of bias relate to the features of the spatial model as well as to the underlying confounding scenario. Using our results, we can explain subtle and counter-intuitive behaviours. Finally, we propose a general approach for dealing with spatial confounding bias in practice, applicable to any spatial model specification. When a covariate contains non-spatial information, we show that a general form of the so-called spatial+ method can be used to eliminate bias. When no such information is present, the situation is more challenging but, under the assumption of unconfounded high frequencies, we develop a procedure in which multiple capped versions of spatial+ are applied to assess the bias. We illustrate our approach with an application to air temperature in Germany.
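To make the link between spatial smoothing and bias concrete, the following is a minimal simulation sketch of the basic spatial+ idea, not the general form developed in the paper. It rests on several illustrative assumptions that are not taken from the source: one-dimensional locations, a small Fourier basis standing in for the spatial effect, a fixed ridge penalty in place of an estimated smoothing parameter, an unpenalized first-stage regression, and a confounder that lies exactly in the span of the basis. The function names (`fourier_basis`, `penalized_fit`, `spatial_plus_fit`) are hypothetical.

```python
import numpy as np

def fourier_basis(s, n_freq):
    """Sine/cosine basis on [0, 1]; an illustrative stand-in for a spatial smooth."""
    cols = []
    for k in range(1, n_freq + 1):
        cols.append(np.cos(2 * np.pi * k * s))
        cols.append(np.sin(2 * np.pi * k * s))
    return np.column_stack(cols)

def penalized_fit(y, covariate, B, lam):
    """Fit y ~ covariate + intercept + B with a ridge penalty on the basis
    coefficients only, and return the covariate effect estimate."""
    n = len(y)
    X = np.column_stack([covariate, np.ones(n), B])
    penalty = np.zeros(X.shape[1])
    penalty[2:] = lam  # covariate effect and intercept are left unpenalized
    coef = np.linalg.solve(X.T @ X + np.diag(penalty), X.T @ y)
    return coef[0]

def spatial_plus_fit(y, covariate, B, lam):
    """Spatial+ idea: remove the spatial part of the covariate first, then
    fit the spatial model with the residual covariate.
    Assumption: the first stage is an unpenalized least-squares fit."""
    n = len(y)
    B_full = np.column_stack([np.ones(n), B])
    gamma, *_ = np.linalg.lstsq(B_full, covariate, rcond=None)
    residual = covariate - B_full @ gamma
    return penalized_fit(y, residual, B, lam)

# Simulated confounding scenario (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, beta_true = 500, 1.0
s = rng.uniform(0, 1, n)                                       # locations
B = fourier_basis(s, n_freq=10)
z = np.cos(2 * np.pi * s) + 0.5 * np.sin(4 * np.pi * s)        # unmeasured spatial confounder
x = z + rng.normal(0, 0.5, n)                                  # covariate with non-spatial information
y = beta_true * x - 2.0 * z + rng.normal(0, 0.1, n)            # response

beta_std = penalized_fit(y, x, B, lam=1000.0)       # standard spatial model
beta_plus = spatial_plus_fit(y, x, B, lam=1000.0)   # spatial+ variant

print(f"true effect:      {beta_true:.3f}")
print(f"standard model:   {beta_std:.3f}")
print(f"spatial+ variant: {beta_plus:.3f}")
```

In this setup the smoothing penalty shrinks the fitted spatial effect, so part of the confounder is attributed to the covariate in the standard model. The residual covariate is orthogonal to the basis by construction, so its effect estimate is unaffected by the penalty; this relies on the covariate's non-spatial noise term and would not help when the covariate is purely spatial.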