Semiparametric Efficient Empirical Higher Order Influence Function Estimators

Robins et al. (2008, 2017) applied the theory of higher order influence functions (HOIFs) to derive an estimator of the mean $\psi$ of an outcome Y in a missing data model with Y missing at random conditional on a vector X of continuous covariates. In contrast to previous estimators, their estimator is semiparametric efficient under the minimal conditions of Robins et al. (2009b), provided that an additional (non-minimal) smoothness condition on the density g of X also holds. This extra condition arises because the Robins et al. (2008, 2017) estimator depends on a nonparametric estimate of g. In this paper, we introduce a new HOIF estimator that has the same asymptotic properties as the original one but does not impose any smoothness requirement on g. This is important for two reasons. First, one rarely has prior knowledge of the properties of g. Second, even when g is smooth, if the dimension of X is even moderate, accurate nonparametric estimation of its density is not feasible at the sample sizes often encountered in applications. In fact, to the best of our knowledge, this new HOIF estimator remains the only semiparametric efficient estimator of $\psi$ under minimal conditions, despite the rapidly growing literature on causal effect estimation. We also show that our estimator can be generalized to the entire class of functionals considered by Robins et al. (2008), which includes the average effect of a treatment on a response Y when a vector X suffices to control confounding, and the expected conditional variance of a response Y given a vector X. Simulation experiments demonstrate that our new estimator outperforms those of Robins et al. (2008, 2017) in finite samples when g is not very smooth.
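
For orientation, the target parameter can be written in a standard form. The notation below (an observation indicator A, the regression b, and the propensity $\pi$) is introduced here for illustration and is not taken from the abstract; it is a sketch of the usual setup rather than the paper's own definitions. Let A = 1 when Y is observed and A = 0 otherwise. Missing at random given X then gives

$$\psi = E[Y] = E[b(X)], \qquad b(x) = E[Y \mid A = 1, X = x], \qquad \pi(x) = P(A = 1 \mid X = x).$$

The familiar first order influence function of $\psi$ in the nonparametric model is

$$\mathsf{IF}_1 = \frac{A}{\pi(X)}\,\{Y - b(X)\} + b(X) - \psi .$$

If b and $\pi$ are replaced by estimates $\hat b$ and $\hat\pi$ computed from an independent sample, the resulting first order estimator has conditional bias

$$E\!\left[\frac{A}{\hat\pi(X)}\,\{Y - \hat b(X)\} + \hat b(X)\right] - \psi = E\!\left[\left(\frac{\pi(X)}{\hat\pi(X)} - 1\right)\{b(X) - \hat b(X)\}\right],$$

a product of the two nuisance estimation errors. Higher order influence function estimators are designed to estimate and remove this remaining bias.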
