This manuscript unites causal inference and spatial statistics, presenting novel insights for causal inference in spatial data analysis and drawing on tools from spatial statistics to estimate causal effects. We introduce spatial causal graphs to highlight that spatial confounding and interference can be entangled, in that investigating the presence of one can lead to incorrect conclusions in the presence of the other. Moreover, we show that spatial dependence in the exposure variable can render standard analyses invalid. To remedy these issues, we propose a Bayesian parametric approach based on tools commonly used in spatial statistics. This approach simultaneously accounts for interference and mitigates bias from local and neighborhood unmeasured spatial confounding. From a Bayesian perspective, we show that incorporating an exposure model is necessary. Under a specific model formulation, we prove that all parameters, including the causal effects, are identifiable even in the presence of unmeasured confounding. We illustrate the approach with a simulation study. We then evaluate the effect of local and neighboring sulfur dioxide emissions from power plants on county-level cardiovascular mortality using observational spatial data from the United States, where unmeasured spatial confounding and interference might be present simultaneously.