We establish a general statistical optimality theory for estimation problems in which the target parameter is a linear functional of an unknown nuisance component that must be estimated from data. This formulation covers many causal and predictive parameters and has applications in numerous disciplines. We adopt the structure-agnostic framework introduced by \citet{balakrishnan2023fundamental}, which imposes no structural assumptions on the nuisance functions beyond access to black-box estimators that achieve a given statistical estimation rate. This framework is particularly appealing when one is only willing to consider estimation strategies that use non-parametric regression and classification oracles as black-box subroutines. Within this framework, we first prove the statistical optimality of the celebrated and widely used doubly robust estimators for the Average Treatment Effect (ATE), the most central parameter in causal inference. We then characterize the minimax optimal rate under the general formulation. Notably, we distinguish two regimes, one in which double robustness can be achieved and one in which it cannot, and in which first-order debiasing yields different error rates. Our result implies that first-order debiasing is simultaneously optimal in both regimes. We instantiate our theory by deriving optimal error rates that recover existing results and extend to various settings of interest, including the case where the nuisance is defined by a generalized regression and the case of covariate shift between the training and test distributions.
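To make the doubly robust estimator of the ATE concrete, the following is a minimal Python sketch under stated assumptions. It uses the standard augmented inverse-propensity-weighted (AIPW) form, which we assume is the estimator meant by "doubly robust" here. The names `simulate` and `aipw_ate`, the toy data-generating design, and the choice to pass the nuisances as plain functions are all hypothetical and chosen for illustration only; in the structure-agnostic setting the nuisances would instead come from black-box regression and classification estimators, typically with sample splitting, which this sketch omits.

```python
import math
import random


def simulate(n, seed=0, true_ate=2.0):
    """Draw (x, d, y) triples from a toy design with a known ATE.

    Assumed design (illustrative only): x ~ Uniform(-1, 1),
    P(D=1 | x) = sigmoid(x), and y = x + true_ate * d + N(0, 1) noise.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        m = 1.0 / (1.0 + math.exp(-x))  # true propensity score
        d = 1 if rng.random() < m else 0
        y = x + true_ate * d + rng.gauss(0.0, 1.0)
        data.append((x, d, y))
    return data


def aipw_ate(data, g0, g1, m):
    """AIPW estimate of the ATE.

    g0, g1: outcome-regression estimates of E[Y | D=0, X] and E[Y | D=1, X].
    m:      propensity estimate of P(D=1 | X), assumed to lie strictly in (0, 1).
    """
    total = 0.0
    for x, d, y in data:
        mx = m(x)
        total += (
            g1(x) - g0(x)                       # plug-in regression contrast
            + d * (y - g1(x)) / mx              # correction from treated units
            - (1 - d) * (y - g0(x)) / (1.0 - mx)  # correction from control units
        )
    return total / len(data)


if __name__ == "__main__":
    data = simulate(20000, seed=1)
    true_m = lambda x: 1.0 / (1.0 + math.exp(-x))
    # Both nuisances correct.
    print(aipw_ate(data, lambda x: x, lambda x: x + 2.0, true_m))
    # Outcome model wrong, propensity correct.
    print(aipw_ate(data, lambda x: 0.0, lambda x: 0.0, true_m))
    # Outcome model correct, propensity wrong.
    print(aipw_ate(data, lambda x: x, lambda x: x + 2.0, lambda x: 0.5))
```

In this toy design all three calls should return values near the true effect of 2, which illustrates double robustness: the estimate remains consistent when either the outcome regressions or the propensity score is correct, even if the other is misspecified.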