Blocking, a special case of rerandomization, is routinely used in the design stage of randomized experiments to balance baseline covariates. This study proposes a regression adjustment method based on the least absolute shrinkage and selection operator (Lasso) to efficiently estimate the average treatment effect in randomized block experiments with high-dimensional covariates. We derive the asymptotic properties of the proposed estimator and give conditions under which it is more efficient than the unadjusted estimator. We provide a conservative variance estimator to facilitate valid inference. Our framework allows some blocks to contain only one treated or one control unit and allows propensity scores to differ across blocks, thus including paired experiments and finely stratified experiments as special cases. We further accommodate rerandomized experiments and designs that combine blocking with rerandomization. Moreover, our analysis allows both the number of blocks and the block sizes to tend to infinity and permits heterogeneous treatment effects across blocks, without assuming a true data-generating model for the outcomes. Simulation studies and two real-data analyses demonstrate the advantages of the proposed method.
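To make the comparison between the unadjusted and regression-adjusted estimators concrete, the following is a minimal sketch of a block-size-weighted difference-in-means estimator with an optional linear covariate adjustment. It is an illustration under stated assumptions, not the paper's exact estimator: the function name `block_ate`, its arguments, and the specific form of the adjustment (centering arm-level covariate means at the block mean) are hypothetical choices made here. The coefficient vectors `beta1` and `beta0` are assumed to come from separate Lasso fits in the treated and control arms (for example with scikit-learn's `Lasso`), which are not shown.

```python
import numpy as np

def block_ate(y, z, block, x=None, beta1=None, beta0=None):
    """Block-size-weighted difference in means, optionally covariate-adjusted.

    y: outcomes; z: 0/1 treatment indicators; block: block labels.
    x, beta1, beta0 (all hypothetical inputs): covariate matrix and
    arm-specific coefficient vectors, e.g. from Lasso fits done elsewhere.
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z)
    block = np.asarray(block)
    n = len(y)
    tau = 0.0
    for b in np.unique(block):
        in_b = block == b
        treated = in_b & (z == 1)
        control = in_b & (z == 0)
        if not treated.any() or not control.any():
            raise ValueError(f"block {b!r} needs at least one unit per arm")
        # Within-block difference in mean outcomes.
        diff = y[treated].mean() - y[control].mean()
        if x is not None:
            # Assumed adjustment: remove the part of each arm mean explained
            # by covariate imbalance relative to the block covariate mean.
            xbar = x[in_b].mean(axis=0)
            diff -= (x[treated].mean(axis=0) - xbar) @ beta1
            diff += (x[control].mean(axis=0) - xbar) @ beta0
        # Weight each block by its share of the sample.
        tau += in_b.sum() / n * diff
    return tau

# Toy data: a pair (block 0) and a block of four units (block 1).
y = [3.0, 1.0, 5.0, 7.0, 2.0, 4.0]
z = [1, 0, 1, 1, 0, 0]
block = [0, 0, 1, 1, 1, 1]
tau_unadjusted = block_ate(y, z, block)
```

Because the loop only requires one treated and one control unit per block, the sketch covers paired and finely stratified designs. With all-zero coefficients the adjusted estimate reduces to the unadjusted one, which is the case a Lasso fit returns when no covariate is selected.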