Augmenting a randomized controlled trial (RCT) with external data may increase power but risks introducing bias. To select and analyze the experiment (RCT alone or RCT combined with external data) with the optimal bias-variance tradeoff, we develop a novel experiment-selector cross-validated targeted maximum likelihood estimator (ES-CVTMLE) for randomized-external data studies. The estimator uses two estimates of bias to decide whether to integrate external data: 1) a function of the difference in the conditional mean outcome under control between the RCT and combined experiments and 2) an estimate of the average treatment effect on a negative control outcome (NCO). We define the asymptotic distribution of the ES-CVTMLE under varying magnitudes of bias and construct confidence intervals by Monte Carlo simulation. In simulations, we compare the ES-CVTMLE with three other data fusion estimators. In a real data analysis of the effect of liraglutide on glycemic control from the LEADER trial, we demonstrate its ability to distinguish biased from unbiased external controls. The ES-CVTMLE has the potential to improve power while providing relatively robust inference for future hybrid RCT-external data studies.
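The selection idea can be illustrated with a deliberately simplified sketch. It is not the ES-CVTMLE itself: it omits cross-validation, targeted maximum likelihood estimation, covariate adjustment, and the NCO-based bias estimate. It assumes an unadjusted difference in means as the effect estimator, takes the bias estimate to be the difference in mean control outcome between the RCT and the pooled controls, and assumes the selection criterion is estimated squared bias plus estimated variance. All names (`Candidate`, `diff_in_means`, `build_candidates`, `select_experiment`) are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, variance


@dataclass
class Candidate:
    name: str    # "rct_only" or "rct_plus_external" (hypothetical labels)
    ate: float   # estimated average treatment effect
    bias: float  # estimated bias of that ATE estimate
    var: float   # estimated variance of that ATE estimate


def diff_in_means(treated, controls):
    """Unadjusted ATE estimate and its variance estimate (a simplification)."""
    ate = mean(treated) - mean(controls)
    var = variance(treated) / len(treated) + variance(controls) / len(controls)
    return ate, var


def build_candidates(rct_treated, rct_controls, ext_controls):
    """Build the two candidate experiments: RCT alone and RCT plus external controls."""
    ate_rct, var_rct = diff_in_means(rct_treated, rct_controls)
    pooled = list(rct_controls) + list(ext_controls)
    ate_pool, var_pool = diff_in_means(rct_treated, pooled)
    # Assumed bias estimate: shift in mean control outcome caused by pooling.
    # The RCT-only experiment is treated as unbiased by randomization.
    bias_pool = mean(rct_controls) - mean(pooled)
    return [
        Candidate("rct_only", ate_rct, 0.0, var_rct),
        Candidate("rct_plus_external", ate_pool, bias_pool, var_pool),
    ]


def select_experiment(candidates):
    """Pick the candidate minimizing estimated squared bias plus variance."""
    return min(candidates, key=lambda c: c.bias ** 2 + c.var)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    treated = [3.0, 4.0, 5.0, 4.0]
    controls = [1.0, 2.0, 3.0, 2.0]
    similar_ext = [2.0, 1.0, 3.0, 2.0, 2.0, 2.0]
    shifted_ext = [6.0, 7.0, 8.0, 7.0, 6.0, 8.0]
    print(select_experiment(build_candidates(treated, controls, similar_ext)).name)
    print(select_experiment(build_candidates(treated, controls, shifted_ext)).name)
```

In this toy setup, external controls that resemble the RCT controls reduce the variance term without adding estimated bias, so the combined experiment is chosen. Controls with a shifted outcome distribution inflate the squared-bias term, so the selector falls back to the RCT alone.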