Doubly-robust inference and optimality in structure-agnostic models with smoothness

We study the problem of constructing an estimator of the average treatment effect (ATE) with observational data. For standard root-n inference, the celebrated doubly-robust, augmented-IPW (AIPW) estimator generally requires that both nuisance functions be estimated consistently and, moreover, that the product of their estimation errors shrink at a rate faster than $n^{-1/2}$. A recent strand of research has aimed to understand the extent to which the AIPW estimator can be improved upon (in a minimax sense). Under structural assumptions on the nuisance functions, the AIPW estimator is typically not minimax-optimal, and improvements can be made using higher-order influence functions (Robins et al., 2017). Conversely, without any assumptions on the nuisances beyond the mean-square-error rates at which they can be estimated, the rate achieved by the AIPW estimator is already optimal (Balakrishnan et al., 2023; Jin and Syrgkanis, 2024). We make three main contributions. First, we propose a new hybrid class of distributions that combines structural agnosticism regarding the nuisance function space with additional smoothness constraints. Second, we calculate minimax lower bounds for estimating the ATE in the new class, as well as in the pure structure-agnostic one. Third, we propose a new estimator of the ATE that enjoys doubly-robust asymptotic linearity: it can yield asymptotically valid Wald-type confidence intervals even when the propensity score or the outcome model is inconsistently estimated, or estimated at a slow rate. Under certain conditions, we show that its rate of convergence in the new class can be much faster than that achieved by the AIPW estimator and, in particular, matches the minimax lower bound rate, thereby establishing its optimality. Finally, we complement our theoretical findings with simulations.

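For readers unfamiliar with the baseline, the following is a minimal sketch of the classical AIPW estimator that the abstract uses as its point of comparison; it is not the new estimator proposed in the paper. With outcome regressions $\mu_0, \mu_1$ and propensity score $\pi$, it averages the scores $\mu_1(X) - \mu_0(X) + A(Y - \mu_1(X))/\pi(X) - (1 - A)(Y - \mu_0(X))/(1 - \pi(X))$ and forms a Wald-type interval from their sample standard deviation. The data-generating process, the function names `simulate` and `aipw_ate`, and the choice to plug in fixed nuisance functions rather than estimated ones are all assumptions made for illustration, so the sketch does not exercise the rate conditions discussed above.

```python
import math
import random
import statistics


def simulate(n, seed=0, true_ate=2.0):
    """Draw (x, a, y) triples from a hypothetical model (illustration only).

    X ~ N(0, 1), P(A = 1 | X) = logistic(X), Y = true_ate * A + X + N(0, 1).
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-x))  # true propensity score
        a = 1 if rng.random() < p else 0
        y = true_ate * a + x + rng.gauss(0.0, 1.0)
        data.append((x, a, y))
    return data


def aipw_ate(data, mu0, mu1, pi):
    """Classical AIPW estimate, standard error, and 95% Wald interval.

    mu0, mu1 and pi are callables of x giving the (possibly wrong)
    outcome regressions and propensity score.
    """
    scores = []
    for x, a, y in data:
        m0, m1, p = mu0(x), mu1(x), pi(x)
        # Plug-in difference plus inverse-probability-weighted residuals.
        scores.append(m1 - m0 + a * (y - m1) / p - (1 - a) * (y - m0) / (1 - p))
    est = statistics.fmean(scores)
    se = statistics.stdev(scores) / math.sqrt(len(scores))
    return est, se, (est - 1.96 * se, est + 1.96 * se)


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


if __name__ == "__main__":
    data = simulate(20000, seed=0)
    cases = {
        "both nuisances correct": (lambda x: x, lambda x: 2.0 + x, logistic),
        "outcome model wrong": (lambda x: 0.0, lambda x: 0.0, logistic),
        "propensity wrong": (lambda x: x, lambda x: 2.0 + x, lambda x: 0.5),
    }
    for label, (mu0, mu1, pi) in cases.items():
        est, se, (lo, hi) = aipw_ate(data, mu0, mu1, pi)
        print(f"{label}: {est:.3f} (95% CI {lo:.3f}, {hi:.3f})")
```

In this toy setting the true ATE is 2, and the estimate should stay close to it as long as at least one of the two nuisance inputs is correct, which is the double robustness in consistency that the abstract takes as its starting point.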