Prognostic Adjustment with Efficient Estimators to Unbiasedly Leverage Historical Data in Randomized Trials

Although randomized controlled trials (RCTs) are a cornerstone of comparative effectiveness, they typically have much smaller sample sizes than observational studies because of financial and ethical considerations. Therefore, there is interest in using plentiful historical data (either observational data or prior trials) to reduce trial sizes. Previous estimators developed for this purpose rely on unrealistic assumptions, without which the added data can bias the treatment effect estimate. Recent work proposed an alternative method (prognostic covariate adjustment) that imposes no additional assumptions and increases efficiency in trial analyses. The idea is to use historical data to learn a prognostic model: a regression of the outcome onto the covariates. The predictions from this model, generated from the RCT subjects' baseline variables, are then used as a covariate in a linear regression analysis of the trial data. In this work, we extend prognostic adjustment to trial analyses with nonparametric efficient estimators, which are more powerful than linear regression. We provide theory that explains why prognostic adjustment improves small-sample point estimation and inference without any possibility of bias. Simulations corroborate the theory: when the trial is small, efficient estimators with prognostic adjustment provide greater power (i.e., smaller standard errors) than the same estimators without it. Population shifts between historical and trial data attenuate the benefits but do not introduce bias. We showcase our estimator using clinical trial data, provided by Novo Nordisk A/S, from a trial that evaluates insulin therapy for individuals with type II diabetes.
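The three steps of prognostic covariate adjustment described above (fit a prognostic model on historical data, predict for the trial subjects from their baseline variables, adjust for that prediction in the trial analysis) can be sketched as follows. This is only an illustration of the original linear-regression version, not of the nonparametric efficient estimators developed in this work. The data-generating process, the sample sizes, the true effect of 1.0, and the helper names `ols`, `features`, and `simulate` are all assumptions made up for the example.

```python
import numpy as np


def ols(X, y):
    """Ordinary least squares: coefficients and classical standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta, np.sqrt(np.diag(cov))


def features(W):
    """Hypothetical feature map for the prognostic model: intercept, W, W^2."""
    return np.column_stack([np.ones(len(W)), W, W**2])


def simulate(n, rng, treated_fraction=0.0, effect=1.0):
    """Toy data (an assumption): nonlinear outcome, constant treatment effect."""
    W = rng.normal(size=(n, 2))
    A = (rng.random(n) < treated_fraction).astype(float)
    Y = W[:, 0] ** 2 + W[:, 1] + effect * A + rng.normal(scale=0.5, size=n)
    return W, A, Y


rng = np.random.default_rng(0)
W_hist, _, Y_hist = simulate(5000, rng)                       # large, untreated historical sample
W_rct, A_rct, Y_rct = simulate(100, rng, treated_fraction=0.5)  # small randomized trial

# Step 1: learn the prognostic model from historical data only.
gamma, _ = ols(features(W_hist), Y_hist)

# Step 2: prognostic score for each trial subject, from baseline covariates.
prog = features(W_rct) @ gamma

# Step 3: analyze the trial with and without the prognostic score as a covariate.
ones = np.ones(len(Y_rct))
beta_unadj, se_unadj = ols(np.column_stack([ones, A_rct]), Y_rct)
beta_adj, se_adj = ols(np.column_stack([ones, A_rct, prog]), Y_rct)

# Index 1 is the treatment coefficient in both design matrices.
print(f"unadjusted: estimate={beta_unadj[1]:.3f}, SE={se_unadj[1]:.3f}")
print(f"adjusted:   estimate={beta_adj[1]:.3f}, SE={se_adj[1]:.3f}")
```

Because treatment is randomized independently of the baseline covariates, the prognostic score only absorbs outcome variance in this toy setup, which is why the adjusted standard error comes out smaller. The historical data enter the trial analysis solely through the fitted score.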
