We study targeted maximum likelihood estimation (TMLE) of the average treatment effect in a semiparametric regression model whose mean function is indexed by a finite-dimensional parameter, while the additive error distribution is left unspecified apart from mild regularity conditions and independence from treatment and baseline covariates. The paper addresses a genuinely new causal problem: because the target depends on both the regression parameter and the unrestricted marginal law of the covariates, the efficient score for the regression parameter must be converted into a causal efficient influence function, a semiparametric efficiency bound, and a targeting step for the average treatment effect itself. We derive those objects, construct a cross-fitted TMLE, and establish its asymptotic linearity and efficiency. In simulations, the proposed estimator performs best when the mean is correctly structured but the error law is heavy-tailed or skewed. In these settings, it yields smaller root mean squared error and shorter confidence intervals than Gaussian working-model inference, a standard augmented inverse-probability-weighted estimator, Bayesian additive regression trees, and a forest-based TMLE benchmark. Misspecification experiments are included to clarify the scope of the method rather than to claim universal superiority under broad mean-model failure.
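To make the dependence of the target on both components concrete, the setup can be sketched as follows. The notation is assumed here for illustration and is not taken from the paper: $Y$ is the outcome, $A \in \{0,1\}$ the treatment, $W$ the baseline covariates, $m(a, w; \beta)$ the parametric mean function, $\varepsilon$ the additive error (taken to have mean zero), and $P_W$ the marginal law of $W$. The second display also assumes the usual identification conditions (consistency, no unmeasured confounding, positivity).

$$
Y = m(A, W; \beta) + \varepsilon, \qquad \varepsilon \perp (A, W), \qquad \mathbb{E}[\varepsilon] = 0,
$$

$$
\psi(\beta, P_W) = \int \bigl\{ m(1, w; \beta) - m(0, w; \beta) \bigr\} \, dP_W(w).
$$

Under this sketch, $\psi$ is a functional of the finite-dimensional $\beta$ and of the unrestricted $P_W$ jointly, which is why an efficient score for $\beta$ alone does not by itself deliver the efficient influence function for the average treatment effect.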