Covariate adjustment is widely recommended to improve statistical efficiency in randomized clinical trials (RCTs), yet empirical evidence comparing available strategies remains limited. This lack of real-world evaluation leaves unresolved practical questions about which adjustment methods to use and which covariates to include. To address this gap, we conduct a large-scale empirical benchmark using individual-level data from 50 publicly accessible RCTs comprising 29,094 participants and 574 treatment-outcome pairs. We evaluate 18 analytical strategies formed by combining six estimators (including classical regression, inverse probability weighting, and machine-learning methods) with three covariate-selection rules. Across diverse therapeutic areas, covariate adjustment consistently improves precision, yielding median variance reductions relative to unadjusted analyses of 13.3% for continuous outcomes and 4.6% for binary outcomes. However, machine-learning algorithms implemented with default hyperparameter settings do not yield efficiency gains beyond simple linear models. Parsimonious regression approaches, such as analysis of covariance, deliver stable, reproducible performance even at moderate sample sizes. Together, these findings provide the first large-scale empirical evidence that transparent and parsimonious covariate adjustment is sufficient and often preferable for routine RCT analysis. All curated datasets and analysis code are openly released as a reproducible benchmark resource to support future clinical research and methodological development.
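To make the headline metric concrete, the following is a minimal sketch of how a variance reduction for analysis of covariance relative to an unadjusted analysis could be computed on a single simulated trial. It is not the benchmark's released code: the simulated data, the function names `unadjusted` and `ancova`, the use of an HC0 robust variance, and the definition of the reduction as one minus the ratio of estimated variances are all assumptions made for illustration.

```python
import numpy as np


def unadjusted(y, a):
    """Difference in means with an unequal-variance estimate of its variance."""
    y1, y0 = y[a == 1], y[a == 0]
    est = y1.mean() - y0.mean()
    var = y1.var(ddof=1) / y1.size + y0.var(ddof=1) / y0.size
    return est, var


def ancova(y, a, x):
    """OLS of y on intercept, treatment and covariates.

    Returns the treatment coefficient and its HC0 (sandwich) variance.
    """
    X = np.column_stack([np.ones_like(y), a, x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    bread = np.linalg.inv(X.T @ X)
    meat = X.T @ (X * resid[:, None] ** 2)
    cov = bread @ meat @ bread
    return beta[1], cov[1, 1]


# Hypothetical simulated trial: two prognostic covariates, 1:1 randomization,
# and a true treatment effect of 1.0 on a continuous outcome.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=(n, 2))
a = rng.integers(0, 2, size=n)
y = 1.0 * a + x @ np.array([1.0, 0.5]) + rng.normal(size=n)

est_u, var_u = unadjusted(y, a)
est_a, var_a = ancova(y, a, x)

# Assumed definition: proportional reduction in estimated variance.
variance_reduction = 1.0 - var_a / var_u
print(f"unadjusted: {est_u:.3f}, ANCOVA: {est_a:.3f}, "
      f"variance reduction: {variance_reduction:.1%}")
```

In this simulation the covariates are strongly prognostic by construction, so the reduction will typically be much larger than the medians reported above; real trials tend to have weaker baseline predictors.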