This paper addresses the asymptotic performance of popular spatial regression estimators on the task of estimating the linear effect of an exposure on an outcome under "spatial confounding" -- the presence of an unmeasured spatially-structured variable influencing both the exposure and the outcome. The existing literature on spatial confounding is informal and inconsistent; this paper attempts to bring clarity through rigorous results on the asymptotic bias and consistency of estimators from popular spatial regression models. We consider two data-generating processes: one where the confounder is a fixed function of space and one where it is a random function (i.e., a stochastic process on the spatial domain). We first show that the estimators from ordinary least squares (OLS) and restricted spatial regression are asymptotically biased under spatial confounding. We then prove a novel main result: the generalized least squares (GLS) estimator constructed with a Gaussian process (GP) covariance matrix is consistent in the presence of spatial confounding under in-fill (fixed domain) asymptotics. The result holds under very general conditions -- for any exposure with some non-spatial variation (noise), for any spatially continuous confounder, for any Mat\'ern or squared exponential GP covariance used to construct the GLS estimator, and without requiring Gaussianity of errors. Finally, we prove that spatial estimators from GLS, GP regression, and spline models that are consistent under confounding by a fixed function are also consistent under confounding by a random function. We conclude that, contrary to much of the literature on spatial confounding, traditional spatial estimators are capable of estimating linear exposure effects under spatial confounding, provided there is some noise in the exposure. We support our theoretical arguments with simulation studies.
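The contrast between OLS and GLS described above can be illustrated with a small simulation. The sketch below is not the paper's simulation design. Everything specific in it is an assumption chosen for illustration: the one-dimensional domain [0, 1], the sine-shaped fixed confounder, the noise levels, the hand-picked (not estimated) covariance parameters, the nugget term, and the function names `simulate`, `ols_slope`, and `gls_slope`. The working covariance is the exponential kernel, which is the Mat\'ern covariance with smoothness 1/2.

```python
import numpy as np


def simulate(n=400, beta=1.0, seed=0):
    """Draw one dataset with a smooth, unmeasured spatial confounder (illustrative setup)."""
    rng = np.random.default_rng(seed)
    s = np.linspace(0.0, 1.0, n)            # in-fill: more points in a fixed domain
    f = np.sin(2.0 * np.pi * s)             # fixed, spatially continuous confounder
    x = f + rng.normal(0.0, 0.5, n)         # exposure = confounder + non-spatial noise
    y = beta * x + 2.0 * f + rng.normal(0.0, 0.5, n)  # outcome also depends on f
    return s, x, y


def ols_slope(x, y):
    """OLS slope of y on x with an intercept; ignores space entirely."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


def gls_slope(s, x, y, sigma2=1.0, phi=0.2, tau2=0.1):
    """GLS slope using an exponential (Matern, smoothness 1/2) working covariance.

    sigma2, phi and tau2 are hand-picked assumptions, not estimates;
    tau2 is a nugget added to the diagonal.
    """
    dist = np.abs(s[:, None] - s[None, :])
    cov = sigma2 * np.exp(-dist / phi) + tau2 * np.eye(len(s))
    design = np.column_stack([np.ones_like(x), x])
    cov_inv_design = np.linalg.solve(cov, design)   # Sigma^{-1} D
    cov_inv_y = np.linalg.solve(cov, y)             # Sigma^{-1} y
    coef = np.linalg.solve(design.T @ cov_inv_design, design.T @ cov_inv_y)
    return coef[1]


if __name__ == "__main__":
    s, x, y = simulate()
    print("true beta:", 1.0)
    print("OLS estimate:", ols_slope(x, y))
    print("GLS estimate:", gls_slope(s, x, y))
```

In this setup the OLS slope absorbs part of the confounder's effect, because the exposure and the outcome share the smooth term. The GLS weighting downweights smooth spatial variation, so the estimate relies mainly on the non-spatial noise in the exposure. Increasing `n` with the domain held fixed mimics the in-fill regime.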