Randomized trials balance all covariates on average and provide the gold standard for estimating treatment effects. Chance imbalances nevertheless arise to some degree in realized treatment allocations and raise an important question: what should we do when the treatment groups differ with respect to some important baseline characteristics? A common strategy is to conduct a {\it preliminary test} of the balance of baseline covariates after randomization, and to invoke covariate adjustment for subsequent inference if and only if the realized allocation fails some prespecified criterion. Although this practice is intuitive and popular among practitioners, the existing literature has so far evaluated its properties only under strong parametric model assumptions, in theory and in simulation, yielding results of limited generality. To fill this gap, we examine two strategies for conducting preliminary test-based covariate adjustment by regression, and evaluate the validity and efficiency of the resulting inferences from the randomization-based perspective. As it turns out, the preliminary-test estimator based on the analysis of covariance can be even less efficient than the unadjusted difference in means, and risks anticonservative confidence intervals based on normal approximation even with the robust standard error. The preliminary-test estimator based on the fully interacted specification is, on the other hand, less efficient than its counterpart under the {\it always-adjust} strategy, and yields overconservative confidence intervals based on normal approximation. Based on theory and simulation, we echo the existing literature and do not recommend the preliminary-test procedure for covariate adjustment in randomized trials.
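To make the procedure concrete, the following is a minimal sketch of a preliminary-test estimator based on the analysis of covariance. It is an illustration under stated assumptions, not the paper's exact construction: the Mahalanobis-type balance statistic, the function names, and the `threshold` argument are all hypothetical choices made here, and the fully interacted specification is not shown.

```python
import numpy as np


def diff_in_means(y, z):
    """Unadjusted estimator: difference in outcome means between arms."""
    return y[z == 1].mean() - y[z == 0].mean()


def ancova(y, z, x):
    """Analysis-of-covariance estimator: the OLS coefficient on z
    from regressing y on an intercept, z, and the centered covariates."""
    xc = x - x.mean(axis=0)
    design = np.column_stack([np.ones(len(y)), z, xc])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


def balance_statistic(z, x):
    """Assumed balance criterion: a Mahalanobis-type distance between
    the covariate means of the two arms (one of many possible tests)."""
    n1 = z.sum()
    n0 = len(z) - n1
    diff = x[z == 1].mean(axis=0) - x[z == 0].mean(axis=0)
    cov = np.atleast_2d(np.cov(x, rowvar=False)) * (1.0 / n1 + 1.0 / n0)
    return float(diff @ np.linalg.solve(cov, diff))


def preliminary_test_estimate(y, z, x, threshold):
    """Adjust by ANCOVA if and only if the allocation fails the balance
    check (statistic above the hypothetical threshold); otherwise use
    the unadjusted difference in means.
    Returns the estimate and a flag indicating whether adjustment ran."""
    if balance_statistic(z, x) > threshold:
        return ancova(y, z, x), True
    return diff_in_means(y, z), False
```

Here `y` is the outcome vector, `z` a 0/1 treatment indicator, and `x` an `(n, k)` covariate matrix. With `k = 2` covariates, one plausible `threshold` is 5.991, the 0.95 quantile of a chi-square distribution with 2 degrees of freedom. The always-adjust strategy corresponds to calling `ancova` regardless of the balance statistic.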