Rerandomization is an effective treatment allocation procedure for controlling baseline covariate imbalance. For estimating the average treatment effect, rerandomization has previously been shown to improve the precision of the unadjusted and the linearly adjusted estimators over simple randomization without compromising consistency. However, it remains unclear whether such results apply more generally to the class of M-estimators, including the g-computation formula with generalized linear regression and doubly robust methods, and more broadly, to efficient estimators with data-adaptive machine learners. In this paper, under a super-population framework, we develop the asymptotic theory for a more general class of covariate-adjusted estimators under rerandomization and its stratified extension. We prove that the asymptotic linearity and the influence function remain identical for any M-estimator under simple randomization and rerandomization, but rerandomization may lead to a non-Gaussian asymptotic distribution. We further explain, drawing examples from several common M-estimators, that asymptotic normality can be achieved if the rerandomization variables are appropriately adjusted for in the final estimator. These results are extended to stratified rerandomization. Finally, we study the asymptotic theory for efficient estimators based on data-adaptive machine learners, and prove their efficiency optimality under rerandomization and stratified rerandomization. Our results are demonstrated via simulations and re-analyses of a cluster-randomized experiment that used stratified rerandomization.
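For readers unfamiliar with the allocation procedure itself, the following is a minimal sketch of rerandomization, assuming a Mahalanobis-distance balance criterion with a fixed acceptance threshold. The abstract does not specify the balance criterion, so the criterion, the threshold value, the simulated covariates, and the names `mahalanobis_imbalance` and `rerandomize` are all illustrative assumptions rather than the procedure studied in the paper.

```python
import numpy as np


def mahalanobis_imbalance(X, A):
    """Mahalanobis distance between treated and control covariate means.

    Assumed balance criterion; the abstract does not name one.
    """
    n = len(A)
    n1 = int(A.sum())
    n0 = n - n1
    diff = X[A == 1].mean(axis=0) - X[A == 0].mean(axis=0)
    # Sample covariance of the covariates (p x p, even when p == 1).
    S = np.atleast_2d(np.cov(X, rowvar=False))
    return float((n1 * n0 / n) * diff @ np.linalg.solve(S, diff))


def rerandomize(X, n_treated, threshold, rng, max_draws=10_000):
    """Redraw a complete randomization until the imbalance is acceptable."""
    n = X.shape[0]
    base = np.zeros(n, dtype=int)
    base[:n_treated] = 1
    for draw in range(1, max_draws + 1):
        A = rng.permutation(base)  # one candidate allocation
        if mahalanobis_imbalance(X, A) <= threshold:
            return A, draw
    raise RuntimeError("no acceptable allocation found within max_draws")


# Hypothetical usage on simulated covariates (illustrative values only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
A, draws = rerandomize(X, n_treated=50, threshold=1.0, rng=rng)
```

A smaller threshold enforces tighter balance but rejects more candidate allocations. Simple randomization corresponds to accepting the first draw regardless of imbalance.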