Treatment effect estimation under unconfoundedness is a fundamental task in causal inference. In response to the challenge of analyzing high-dimensional datasets collected in substantive fields such as epidemiology, genetics, economics, and the social sciences, many methods for treatment effect estimation with high-dimensional nuisance parameters (the outcome regression and the propensity score) have been developed in recent years. However, it remains unclear what the necessary and sufficient sparsity condition on the nuisance parameters is for the treatment effect to be $\sqrt{n}$-estimable. In this paper, we propose a new Double-Calibration strategy that corrects the estimation bias of nuisance parameter estimates computed by regularized high-dimensional techniques. We demonstrate that the corresponding Doubly-Calibrated estimator achieves the $1 / \sqrt{n}$ rate as long as one of the nuisance parameters is sparse with sparsity below $\sqrt{n} / \log p$, where $p$ denotes the ambient dimension of the covariates, whereas the other nuisance parameter can be arbitrarily complex and completely misspecified. The Double-Calibration strategy can also be applied to settings other than treatment effect estimation, e.g., regression coefficient estimation in the presence of a diverging number of controls in a semiparametric partially linear model.
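To make the setting concrete, the following is a minimal formal sketch of the quantities the abstract refers to. The notation ($X$, $A$, $Y$, $\mu_a$, $\pi$, $\tau$, $s_\mu$, $s_\pi$) and the choice of the average treatment effect as the target are assumptions introduced here for illustration; the abstract does not fix them. With covariates $X \in \mathbb{R}^p$, a binary treatment $A$, and an outcome $Y$ with potential outcomes $Y(0), Y(1)$, unconfoundedness and the two nuisance parameters can be written as

$$
Y(a) \perp A \mid X, \qquad \mu_a(x) = \mathbb{E}[Y \mid A = a, X = x], \qquad \pi(x) = \mathbb{P}(A = 1 \mid X = x),
$$

and, under unconfoundedness together with the usual overlap assumption, the target is identified as

$$
\tau = \mathbb{E}[Y(1) - Y(0)] = \mathbb{E}[\mu_1(X) - \mu_0(X)].
$$

If $s_\mu$ and $s_\pi$ denote the sparsity levels of the outcome regression and the propensity score when each is modeled sparsely, the condition described above reads, roughly,

$$
s_\mu < \frac{\sqrt{n}}{\log p} \quad \text{or} \quad s_\pi < \frac{\sqrt{n}}{\log p},
$$

with no requirement on the other nuisance parameter. The exact form of the inequality (constants, or an order-of-magnitude statement) is not specified in the abstract and should be taken from the paper itself.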