The completely randomized experiment is the gold standard for causal inference. When covariate information is available for each experimental candidate, a typical approach is to adjust for these covariates in order to estimate the treatment effect more accurately. In this paper, we investigate this problem under the randomization-based framework, in which the covariates and potential outcomes of all experimental candidates are treated as deterministic quantities and the randomness comes solely from the treatment assignment mechanism. Under this framework, to achieve asymptotically valid inference, existing estimators usually require either (i) that the dimension of covariates $p$ grows at a rate no faster than $O(n^{3 / 4})$ as the sample size $n \to \infty$; or (ii) certain sparsity constraints on the linear representations of the potential outcomes constructed from possibly high-dimensional covariates. We consider the moderately high-dimensional regime, where $p$ is allowed to be of the same order of magnitude as $n$. We develop a novel debiased estimator with a corresponding inference procedure and establish its asymptotic normality under mild assumptions. Our estimator is model-free and does not require any sparsity constraint on the linear representations of the potential outcomes. We also discuss its asymptotic efficiency improvement over the unadjusted treatment effect estimator under different dimensionality constraints. Numerical analysis confirms that, compared with other treatment effect estimators based on regression adjustment, our debiased estimator performs well in moderately high dimensions.
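To make the randomization-based framework concrete, the following is a minimal sketch under stated assumptions. It is not the debiased estimator proposed in the paper. It holds a simulated finite population fixed, redraws only the completely randomized assignment, and compares the unadjusted difference in means with a standard regression adjustment that fits a separate linear regression in each arm, in a low-dimensional setting with $p \ll n$. The function name `simulate`, the data-generating process, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def arm_intercept(X_arm, y_arm):
    """OLS of y on (1, X) within one arm; returns the fitted intercept."""
    design = np.column_stack([np.ones(len(y_arm)), X_arm])
    coef, *_ = np.linalg.lstsq(design, y_arm, rcond=None)
    return coef[0]


def simulate(n=200, p=5, n_treated=100, n_draws=1000, seed=0):
    rng = np.random.default_rng(seed)

    # Finite population: generated once, then treated as fixed (deterministic).
    X = rng.normal(size=(n, p))
    X = X - X.mean(axis=0)               # center covariates over all n units
    beta = rng.normal(size=p)            # hypothetical outcome coefficients
    y0 = X @ beta + rng.normal(size=n)   # potential outcomes under control
    y1 = y0 + 1.0 + 0.5 * X[:, 0]        # potential outcomes under treatment
    tau = np.mean(y1 - y0)               # finite-population average effect

    unadjusted = np.empty(n_draws)
    adjusted = np.empty(n_draws)
    for b in range(n_draws):
        # The only source of randomness: which n_treated units are treated.
        z = np.zeros(n, dtype=bool)
        z[rng.choice(n, size=n_treated, replace=False)] = True
        y = np.where(z, y1, y0)          # observed outcomes

        # Unadjusted estimator: difference in arm means.
        unadjusted[b] = y[z].mean() - y[~z].mean()

        # Regression adjustment: with covariates centered over the full
        # population, the difference of arm-wise intercepts estimates tau.
        adjusted[b] = arm_intercept(X[z], y[z]) - arm_intercept(X[~z], y[~z])

    return {
        "tau": tau,
        "unadjusted_mean": unadjusted.mean(),
        "adjusted_mean": adjusted.mean(),
        "unadjusted_var": unadjusted.var(),
        "adjusted_var": adjusted.var(),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.4f}")
```

In this low-dimensional sketch the adjusted estimator should show a smaller variance across assignments than the difference in means. The regime the abstract targets, with $p$ comparable to $n$, is where such standard adjustments can break down and a debiasing step becomes relevant.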