When estimating causal effects, it is important to assess external validity, i.e., to determine how useful a given study is for informing a practical question about a specific target population. One challenge is that the covariate distribution in the population underlying a study may differ from that in the target population. If some covariates are effect modifiers, the average treatment effect (ATE) may not generalize to the target population. To tackle this problem, we propose new methods to generalize or transport the ATE from a source population to a target population, in the case where the source and target populations have different sets of covariates. When the ATE in the target population is identified, we propose new doubly robust estimators and establish their rates of convergence and limiting distributions. Under regularity conditions, the doubly robust estimators provably achieve the efficiency bound and are locally asymptotically minimax optimal. A sensitivity analysis is provided for settings where the identification assumptions fail. Simulation studies show the advantages of the proposed doubly robust estimators over simple plug-in estimators. Importantly, we also provide minimax lower bounds and higher-order estimators of the target functionals. The proposed methods are applied to transport the causal effects of dietary intake on adverse pregnancy outcomes from an observational study to the whole U.S. female population.