Covariate adjustment is a ubiquitous method for estimating the average treatment effect (ATE) from observational data. Assuming a known graphical structure of the data-generating model, recent results give graphical criteria for optimal adjustment, which enable efficient estimation of the ATE. However, graphical approaches are challenging for high-dimensional and complex data, and it is not straightforward to specify a meaningful graphical model of non-Euclidean data such as texts. We propose a general framework that accommodates adjustment for any subset of information expressed by the covariates. We generalize prior work and leverage these results to identify the optimal covariate information for efficient adjustment. This information is minimally sufficient for prediction of the outcome conditional on treatment. Based on our theoretical results, we propose the Debiased Outcome-adapted Propensity Estimator (DOPE) for efficient estimation of the ATE, and we provide asymptotic results for the DOPE under general conditions. Compared to the augmented inverse propensity weighted (AIPW) estimator, the DOPE can retain its efficiency even when the covariates are highly predictive of treatment. We illustrate this with a single-index model. With an implementation of the DOPE based on neural networks, we demonstrate its performance on simulated and real data. Our results show that the DOPE provides an efficient and robust methodology for ATE estimation in various observational settings.
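For readers unfamiliar with the AIPW baseline mentioned above, the following is a minimal sketch of the standard AIPW estimator of the ATE on simulated data. It is not the DOPE and not the authors' implementation. The simulation design, the function names (`aipw_ate`, `simulate`, `fit_linear_outcome`), the linear outcome model, and the use of the true propensity score in place of a fitted one are all assumptions made purely for illustration.

```python
import numpy as np


def aipw_ate(y, t, mu1, mu0, e):
    """AIPW estimate of the ATE.

    y: outcomes, t: binary treatment, mu1/mu0: predicted outcomes under
    treatment/control, e: propensity scores P(T=1 | X).
    """
    y, t, mu1, mu0, e = map(np.asarray, (y, t, mu1, mu0, e))
    # Outcome-regression contrast plus inverse-propensity-weighted residuals.
    psi = (mu1 - mu0
           + t * (y - mu1) / e
           - (1 - t) * (y - mu0) / (1 - e))
    estimate = psi.mean()
    std_error = psi.std(ddof=1) / np.sqrt(len(psi))
    return estimate, std_error


def simulate(n, seed=0, true_ate=2.0):
    """Hypothetical data: three Gaussian covariates, logistic propensity."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, 3))
    e = 1.0 / (1.0 + np.exp(-0.5 * x[:, 0]))  # true propensity score
    t = rng.binomial(1, e)
    y = true_ate * t + x @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=n)
    return x, t, y, e


def fit_linear_outcome(x, y, mask):
    """OLS fit on the rows selected by mask; predictions for all rows."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design[mask], y[mask], rcond=None)
    return design @ coef


# Usage: fit one outcome model per treatment arm, then combine.
x, t, y, e = simulate(20_000)
mu1 = fit_linear_outcome(x, y, t == 1)
mu0 = fit_linear_outcome(x, y, t == 0)
estimate, std_error = aipw_ate(y, t, mu1, mu0, e)
print(f"AIPW ATE estimate: {estimate:.3f} (SE {std_error:.3f})")
```

The inverse-propensity terms divide by `e` and `1 - e`, so the estimator's variance grows when propensities approach 0 or 1. That is the regime, covariates highly predictive of treatment, in which the abstract says the DOPE can retain its efficiency.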