This paper studies covariate-adjusted estimation of the average treatment effect (ATE) in stratified experiments. We work in the stratified randomization framework of Cytrynbaum (2021), which includes matched tuples designs (e.g., matched pairs), coarse stratification, and complete randomization as special cases. We show that the Lin (2013) interacted regression is generically asymptotically inefficient, attaining efficiency only in the edge case of complete randomization. Motivated by this finding, we derive the optimal linear covariate adjustment for a given stratified design and construct several new estimators that achieve the minimal variance. Conceptually, we show that optimal linear adjustment of a stratified design is equivalent in large samples to doubly-robust semiparametric adjustment of an independent design. We also develop novel asymptotically exact inference for the ATE over a general family of adjusted estimators, and show in simulations that the usual Eicker-Huber-White confidence intervals can significantly overcover. Our inference methods produce shorter confidence intervals by fully accounting for the precision gains from both covariate adjustment and stratified randomization. Simulation experiments and an empirical application to the Oregon Health Insurance Experiment data (Finkelstein et al., 2012) demonstrate the value of our proposed methods.
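For readers unfamiliar with the baseline that the abstract refers to, the Lin (2013) interacted regression can be sketched as follows. This is a minimal illustration with a single covariate, not the optimal estimators proposed in the paper. It relies on the standard fact that regressing the outcome on treatment, the demeaned covariate, and their interaction gives the same treatment coefficient as fitting a separate regression in each arm and differencing the predictions at the overall covariate mean. The function names `ols_slope` and `lin_interacted_ate` and the toy data are assumptions made for illustration only.

```python
from statistics import fmean


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    xb, yb = fmean(x), fmean(y)
    sxx = sum((xi - xb) ** 2 for xi in x)
    sxy = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y))
    return sxy / sxx


def lin_interacted_ate(y, d, x):
    """Interacted-regression ATE estimate with one covariate (hypothetical helper).

    Fits a separate line in each treatment arm and evaluates both
    fitted lines at the overall covariate mean, then takes the difference.
    """
    xbar = fmean(x)  # overall covariate mean across both arms
    fitted_at_xbar = {}
    for arm in (0, 1):
        xa = [xi for xi, di in zip(x, d) if di == arm]
        ya = [yi for yi, di in zip(y, d) if di == arm]
        slope = ols_slope(xa, ya)
        # Arm-specific fitted value at the overall covariate mean.
        fitted_at_xbar[arm] = fmean(ya) - slope * (fmean(xa) - xbar)
    return fitted_at_xbar[1] - fitted_at_xbar[0]


# Toy data (illustrative only): noiseless outcomes with a constant effect of 2.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
d = [1, 0, 1, 0, 1, 0]
y = [2.0 * di + 3.0 * xi for di, xi in zip(d, x)]
estimate = lin_interacted_ate(y, d, x)
```

On this noiseless toy data the estimate recovers the constant treatment effect up to floating-point error. The abstract's point is that under stratified designs other than complete randomization, this adjustment is generally not the minimum-variance linear adjustment.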