This paper studies covariate-adjusted estimation of the average treatment effect in stratified experiments. We work in a general framework that includes matched tuples designs, coarse stratification, and complete randomization as special cases. Regression adjustment with treatment-covariate interactions is known to weakly improve efficiency for completely randomized designs. By contrast, we show that for stratified designs such regression estimators are generically inefficient and can even increase estimator variance relative to the unadjusted benchmark. Motivated by this result, we derive the asymptotically optimal linear covariate adjustment for a given stratification. We construct several feasible estimators that implement this efficient adjustment in large samples. In the special case of matched pairs, for example, the regression including treatment, covariates, and pair fixed effects is asymptotically optimal. Conceptually, we show an equivalence between efficient linear adjustment of a stratified design and doubly robust semiparametric adjustment of an independent design. We also provide novel asymptotically exact inference methods that allow researchers to report smaller confidence intervals, fully reflecting the efficiency gains from both stratification and adjustment. Simulations and an application to the Oregon Health Insurance Experiment data demonstrate the value of our proposed methods.
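As a rough illustration of the matched-pairs special case mentioned above, the following minimal sketch regresses the outcome on treatment, a covariate, and pair fixed effects, and reads off the treatment coefficient. It is not the authors' code: the function name `pair_fe_estimate`, the simulated data-generating process, and the pairing rule (sorting units on a single covariate and pairing neighbours) are all assumptions made for illustration only.

```python
import numpy as np

def pair_fe_estimate(y, d, x, pair_ids):
    """Coefficient on d from OLS of y on d, x, and pair dummies (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    # Map arbitrary pair labels to 0..P-1 and build one dummy column per pair.
    _, idx = np.unique(np.asarray(pair_ids).ravel(), return_inverse=True)
    dummies = np.eye(idx.max() + 1)[idx]
    # The pair dummies span the constant, so no separate intercept is added.
    design = np.column_stack([d, x, dummies])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0]

# Simulated matched-pairs experiment (assumed design, for illustration only).
rng = np.random.default_rng(0)
n_pairs = 200
x = rng.normal(size=2 * n_pairs)

# Pair units that are adjacent after sorting on the covariate.
order = np.argsort(x)
pair_ids = np.empty(2 * n_pairs, dtype=int)
pair_ids[order] = np.repeat(np.arange(n_pairs), 2)

# Within each pair, treat exactly one unit, chosen at random.
first = rng.integers(0, 2, size=n_pairs)
d = np.zeros(2 * n_pairs)
d[order[2 * np.arange(n_pairs) + first]] = 1.0

# Outcome with an assumed true average treatment effect of 1.0.
y = 1.0 * d + 2.0 * x + rng.normal(size=2 * n_pairs)

tau_hat = pair_fe_estimate(y, d, x, pair_ids)
print(f"pair fixed-effects estimate of the ATE: {tau_hat:.3f}")
```

By the Frisch-Waugh-Lovell theorem, the same coefficient is obtained as the intercept from regressing within-pair treated-minus-control outcome differences on a constant and the corresponding covariate differences, which avoids building the dummy matrix when the number of pairs is large.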