Unmeasured confounding can render identification strategies based on adjustment functionals invalid. We study the "Napkin graph", a causal structure that encapsulates patterns of M-bias, instrumental variables, and the classical back-door and front-door models within a single graphical framework, yet requires a nonstandard identification strategy: the average treatment effect is expressed as a ratio of two g-formulas. We develop novel estimators for this functional, including doubly robust one-step and targeted minimum loss-based estimators that remain asymptotically linear when nuisance functions are estimated at slower-than-parametric rates using machine learning. We also show how a generalized independence restriction encoded by the Napkin graph, known as a Verma constraint, can be exploited to improve efficiency, illustrating more generally how such constraints in hidden variable DAGs can inform semiparametric inference. The proposed methods are validated through simulations and applied to the Finnish Life Course study to estimate the effect of educational attainment on income. An accompanying R package, napkincausal, implements all proposed procedures.
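To make the "ratio of two g-formulas" concrete, here is a sketch under an assumed labeling that the abstract itself does not give: observed variables $W \to Z \to X \to Y$, with one hidden confounder between $W$ and $X$ and another between $W$ and $Y$. Here $X$ is the treatment, $Y$ the outcome, and $z$ any fixed value of $Z$. Under this conventional presentation of the Napkin graph, the interventional distribution and the resulting average treatment effect for a binary treatment can be written as follows; the paper's own notation may differ.

```latex
% Assumed notation (not from the abstract): W -> Z -> X -> Y,
% hidden confounders W <-> X and W <-> Y, discrete W for simplicity.
\begin{align}
  P(y \mid \mathrm{do}(x))
    &= \frac{\sum_{w} P(x, y \mid z, w)\, P(w)}
            {\sum_{w} P(x \mid z, w)\, P(w)}, \\[4pt]
  \mathbb{E}[Y \mid \mathrm{do}(x)]
    &= \frac{\sum_{w} \mathbb{E}\big[\, Y\, \mathbb{1}(X = x) \mid z, w \,\big]\, P(w)}
            {\sum_{w} P(x \mid z, w)\, P(w)}, \\[4pt]
  \psi
    &= \mathbb{E}[Y \mid \mathrm{do}(X = 1)] - \mathbb{E}[Y \mid \mathrm{do}(X = 0)].
\end{align}
```

The numerator and denominator are each a g-formula that adjusts for $W$ at a fixed $z$. The ratio on the right-hand side does not depend on the chosen $z$, even though each factor does. This invariance is the generalized independence (Verma) restriction that the abstract describes as a source of efficiency gains.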