Semiparametric inference on average causal effects from observational data relies on assumptions that identify the effects. In practice, several distinct identifying assumptions may be plausible, and the analyst must make a delicate choice among the corresponding models. In this paper, we study three identifying assumptions within the potential outcome framework: the back-door assumption, which uses pre-treatment covariates; the front-door assumption, which uses mediators; and the two-door assumption, which uses pre-treatment covariates and mediators simultaneously. We provide the efficient influence functions and the corresponding semiparametric efficiency bounds that hold under these assumptions and their combinations. We demonstrate that none of the identification models yields uniformly the most efficient estimation, and we give conditions under which some bounds are lower than others. We show when semiparametric estimating equation estimators based on the influence functions attain the bounds, and we study the robustness of these estimators to misspecification of the nuisance models. The theory is complemented by simulation experiments on the finite-sample behavior of the estimators. The results are relevant for an analyst choosing among several plausible identifying assumptions and their corresponding estimators: this choice implies a trade-off between efficiency and robustness to misspecification of the nuisance models.
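For orientation, the first two identifying assumptions named in the abstract correspond to the standard textbook identification formulas sketched below. The notation is an assumption made here for illustration and is not taken from the paper: T is a binary treatment, X the pre-treatment covariates, M a discrete mediator, Y the outcome, and Y(t) the potential outcome under T = t. The two-door formula and the efficient influence functions are not reproduced, since the abstract does not state them.

```latex
% Notation assumed for illustration only (not taken from the paper):
% T binary treatment, X pre-treatment covariates, M discrete mediator,
% Y outcome, Y(t) potential outcome under T = t.
\begin{align*}
\tau &= E[Y(1)] - E[Y(0)]
  && \text{(average causal effect)} \\
E[Y(t)] &= E_X\bigl[\, E[Y \mid T = t, X] \,\bigr]
  && \text{(back-door)} \\
E[Y(t)] &= \sum_{m} P(M = m \mid T = t) \sum_{t'} E[Y \mid T = t', M = m]\, P(T = t')
  && \text{(front-door)}
\end{align*}
```

When both assumptions are plausible, the two formulas target the same quantity through different nuisance models, which is where the efficiency and robustness comparison described in the abstract arises.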