We develop a unified framework for automatic debiased machine learning (autoDML) for inference on a broad class of statistical parameters. The framework applies to any smooth functional of a nonparametric M-estimand, defined as the minimizer of a population risk over an infinite-dimensional linear space. Examples include counterfactual regression, quantile, and survival functions, as well as conditional average treatment effects. Rather than requiring manual derivation of influence functions, our approach automates the construction of debiased estimators using three ingredients: the gradient and Hessian of the loss function and a linear approximation of the target functional. Estimation reduces to solving two risk minimization problems, one for the M-estimand and one for a Riesz representer. The framework accommodates Neyman-orthogonal loss functions that depend on nuisance parameters and extends to vector-valued M-estimands through joint risk minimization. We characterize the efficient influence function and construct efficient autoDML estimators via one-step correction, targeted minimum loss estimation, and sieve-based plug-in methods. Under quadratic risk, these estimators satisfy double robustness for linear functionals. We further show that they are robust to mild misspecification of the M-estimand model, incurring only second-order bias. We illustrate the method by estimating long-term survival probabilities under a semiparametric two-parameter beta-geometric failure model.
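As a minimal sketch of the two-risk-minimization recipe described above, consider the simplest special case: squared-error loss, so the M-estimand is a regression function, and the linear functional E[theta(1, X) - theta(0, X)]. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `basis`, `autodml_ate`, and `simulate`, the four-term linear sieve, and the simulated data. With squared-error loss the gradient is `theta - y` and the Hessian is 1, so the Riesz representer can be fit by minimizing the empirical risk mean(alpha^2 - 2 m(W; alpha)), which has a closed form over a linear sieve.

```python
import numpy as np


def basis(a, x):
    # Hypothetical finite-dimensional sieve: intercept, treatment,
    # covariate, and their interaction.
    a = np.asarray(a, dtype=float)
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones_like(x), a, x, a * x])


def autodml_ate(y, a, x):
    """One-step debiased estimate of E[theta(1, X) - theta(0, X)].

    Illustrative special case only: squared-error loss and a linear sieve.
    """
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    B = basis(a, x)
    # Linear functional applied to each basis function:
    # m(W; b) = b(1, X) - b(0, X).
    M = basis(np.ones(n), x) - basis(np.zeros(n), x)

    # Risk minimization 1: M-estimand under loss 0.5 * (y - theta)^2.
    beta, *_ = np.linalg.lstsq(B, y, rcond=None)
    theta = B @ beta

    # Risk minimization 2: Riesz representer alpha = B @ rho minimizing
    # mean(alpha^2 - 2 * m(W; alpha)); first-order condition G rho = mean(M).
    G = B.T @ B / n
    rho = np.linalg.solve(G, M.mean(axis=0))
    alpha = B @ rho

    # One-step correction: plug-in minus alpha times the loss gradient.
    grad = theta - y
    scores = M @ beta - alpha * grad
    estimate = scores.mean()
    std_error = scores.std(ddof=1) / np.sqrt(n)
    return estimate, std_error


def simulate(n, seed=0):
    # Hypothetical data-generating process with a true effect of 2.
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    p = 1.0 / (1.0 + np.exp(-0.5 * x))
    a = rng.binomial(1, p).astype(float)
    y = 1.0 + 2.0 * a + x + 0.5 * a * x + rng.normal(size=n)
    return y, a, x
```

In this particular setup the regression and the Riesz representer share the same linear sieve, so the least-squares residuals are orthogonal to the basis and the correction term is numerically zero; the one-step and sieve plug-in estimates therefore coincide. With flexible machine-learning fits for either nuisance, the correction term would generally be nonzero.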