Benchmarking covariate-adjustment strategies for randomized clinical trials

Background and Objective: Covariate adjustment can improve precision and power in randomized clinical trials and is recommended by major regulatory agencies. However, there is limited empirical evidence on how different adjustment strategies perform across diverse real-world trials, leaving uncertainty about which methods and covariates should be prespecified in statistical analysis plans. We aim to address this gap and provide practical recommendations.

Methods: We conducted a large-scale empirical study using individual-level data from 50 publicly available randomized trials (29,094 participants; 574 treatment-outcome comparisons). We compared commonly used covariate-adjusted estimators, including analysis of covariance, inverse-probability weighting, g-computation, and machine-learning-based approaches, combined with three covariate-selection strategies. Performance was evaluated using precision gains, changes in point estimates, computational reliability, and the probability that covariate adjustment altered statistical significance relative to an unadjusted analysis.

Results: Covariate adjustment improved precision in most settings, with a median variance reduction of 13.3% for continuous outcomes and 4.6% for binary outcomes. Parsimonious regression approaches using a small prespecified set of prognostic covariates performed as well as or better than more complex methods, particularly in small to medium samples. Machine-learning-based estimators did not provide additional precision and were more prone to computational failure for binary outcomes.

Conclusions: Across trials, parsimonious covariate adjustment provided consistent efficiency gains without introducing systematic bias. These findings support routine covariate adjustment in primary trial analyses. All curated datasets and analysis code are openly released to support future clinical research.
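
Illustrative sketch: To make the notion of "variance reduction" concrete, the following minimal simulation compares an unadjusted difference in means with an analysis-of-covariance estimator for a continuous outcome. It is not the released analysis code and does not use the trial data. The data-generating model, the parameter values (`effect`, `beta`, `n`, `reps`), and all function names are assumptions chosen for illustration only.

```python
import numpy as np


def simulate_trial(n, rng, effect=0.5, beta=1.0):
    """Simulate one hypothetical two-arm trial with a single prognostic covariate."""
    x = rng.normal(size=n)              # baseline covariate (assumed standard normal)
    a = rng.integers(0, 2, size=n)      # simple 1:1 randomization
    y = effect * a + beta * x + rng.normal(size=n)  # continuous outcome
    return x, a, y


def unadjusted(a, y):
    """Difference in arm means, ignoring the covariate."""
    return y[a == 1].mean() - y[a == 0].mean()


def ancova(x, a, y):
    """Treatment coefficient from a linear model with a centered covariate."""
    design = np.column_stack([np.ones_like(x), a, x - x.mean()])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


def variance_reduction(n=200, reps=2000, seed=0):
    """Empirical variance reduction of ANCOVA relative to the unadjusted estimator."""
    rng = np.random.default_rng(seed)
    estimates = np.array([
        (unadjusted(a, y), ancova(x, a, y))
        for x, a, y in (simulate_trial(n, rng) for _ in range(reps))
    ])
    var_unadj, var_adj = estimates.var(axis=0, ddof=1)
    # Return the relative reduction and the mean of each estimator across replicates.
    return 1.0 - var_adj / var_unadj, estimates.mean(axis=0)


if __name__ == "__main__":
    reduction, means = variance_reduction()
    print(f"variance reduction: {reduction:.1%}")
    print(f"mean unadjusted estimate: {means[0]:.3f}, mean ANCOVA estimate: {means[1]:.3f}")
```

In this assumed setup the covariate explains about half of the within-arm outcome variance, so the simulated reduction is far larger than the medians reported above for real trials. Both estimators center on the same assumed treatment effect, which illustrates how adjustment can change precision without shifting the point estimate systematically.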
