When estimating causal effects, it is important to assess external validity, i.e., to determine how useful a given study is for informing a practical question about a specific target population. One challenge is that the covariate distribution in the population underlying a study may differ from that in the target population. If some covariates are effect modifiers, the average treatment effect (ATE) may not generalize to the target population. To tackle this problem, we propose new methods to generalize or transport the ATE from a source population to a target population in the case where the source and target populations have different sets of covariates. When the ATE in the target population is identified, we propose new doubly robust estimators and establish their rates of convergence and limiting distributions. Under regularity conditions, the doubly robust estimators provably achieve the efficiency bound and are locally asymptotically minimax optimal. We provide a sensitivity analysis for settings in which the identification assumptions fail. Simulation studies show the advantages of the proposed doubly robust estimators over simple plug-in estimators. Importantly, we also provide minimax lower bounds and higher-order estimators of the target functionals. The proposed methods are applied to transport causal effects of dietary intake on adverse pregnancy outcomes from an observational study to the whole U.S. female population.