How should covariates be handled in randomized trials? Empirical evidence from 50 trials and recommendations for practice

Background and Objective: Covariate adjustment can improve precision and power in randomized clinical trials and is recommended by major regulatory agencies. However, there is limited empirical evidence on how different adjustment strategies perform across diverse real-world trials, leaving uncertainty about which methods and covariates should be prespecified in statistical analysis plans. We aim to address this gap and provide practical recommendations.

Methods: We conducted a large-scale empirical study using individual-level data from 50 publicly available randomized trials (29,094 participants; 574 treatment-outcome comparisons). We compared commonly used covariate-adjusted estimators, including analysis of covariance, inverse-probability weighting, g-computation, and machine-learning-based approaches, combined with three covariate-selection strategies. Performance was evaluated using precision gains, changes in point estimates, computational reliability, and the probability that covariate adjustment altered statistical significance relative to an unadjusted analysis.

Results: Covariate adjustment improved precision in most settings, with a median variance reduction of 13.3% for continuous outcomes and 4.6% for binary outcomes. Parsimonious regression approaches using a small prespecified set of prognostic covariates performed as well as or better than more complex methods, particularly in small to medium samples. Machine-learning-based estimators did not provide additional precision and were more prone to computational failure for binary outcomes.

Conclusions: Across trials, parsimonious covariate adjustment provided consistent efficiency gains without introducing systematic bias. These findings support routine covariate adjustment in primary trial analyses. All curated datasets and analysis code are openly released to support future clinical research.
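To make the comparison between an unadjusted analysis and analysis of covariance concrete, the following minimal sketch simulates a single two-arm trial with one prognostic covariate and a continuous outcome. It is not the study's analysis code: the data-generating model, the sample size, the helper names `unadjusted` and `ancova`, the HC0 sandwich variance, and the definition of variance reduction as one minus the ratio of estimated variances are all assumptions made for illustration.

```python
import numpy as np


def unadjusted(y, a):
    """Difference in means with the usual unequal-variance estimate."""
    y1, y0 = y[a == 1], y[a == 0]
    est = y1.mean() - y0.mean()
    var = y1.var(ddof=1) / y1.size + y0.var(ddof=1) / y0.size
    return est, var


def ancova(y, a, x):
    """OLS of y on intercept, treatment, and covariate.

    Returns the treatment coefficient and its HC0 (sandwich) variance.
    """
    X = np.column_stack([np.ones_like(y), a, x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    xtx_inv = np.linalg.inv(X.T @ X)
    meat = X.T @ (X * resid[:, None] ** 2)
    cov = xtx_inv @ meat @ xtx_inv
    return beta[1], cov[1, 1]


# Hypothetical simulated trial: 1:1 randomization, true effect of 1.0,
# and one covariate that explains about half of the outcome variance.
rng = np.random.default_rng(0)
n = 400
x = rng.normal(size=n)
a = rng.integers(0, 2, size=n).astype(float)
y = 1.0 * a + 1.0 * x + rng.normal(size=n)

est_u, var_u = unadjusted(y, a)
est_a, var_a = ancova(y, a, x)

# Assumed definition: relative reduction in the estimated variance.
reduction = 1.0 - var_a / var_u

print(f"unadjusted estimate: {est_u:.3f} (variance {var_u:.4f})")
print(f"ANCOVA estimate:     {est_a:.3f} (variance {var_a:.4f})")
print(f"variance reduction:  {100 * reduction:.1f}%")
```

In this simulation the covariate is strongly prognostic by construction, so the reduction is much larger than the medians reported above. A weakly prognostic covariate would give a correspondingly smaller gain.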
