We consider the problem of estimating the average treatment effect (ATE) when both randomized controlled trial (RCT) data and real-world data (RWD) are available. We decompose the ATE estimand as the difference between a pooled-ATE estimand that integrates RCT and RWD and a bias estimand that captures the conditional effect of RCT enrollment on the outcome. We introduce an adaptive targeted minimum loss-based estimation (A-TMLE) framework to estimate both components. We prove that the A-TMLE estimator is root-n-consistent and asymptotically normal. Moreover, in finite samples, it achieves the super-efficiency one would obtain had one known the oracle model for the conditional effect of RCT enrollment on the outcome. Consequently, the smaller the working model for the bias induced by the RWD, the greater our estimator's efficiency, while our estimator is always at least as efficient as an efficient estimator that uses the RCT data only. In simulations, A-TMLE outperforms existing methods, with smaller mean squared error and narrower 95% confidence intervals. A-TMLE could help utilize RWD to improve the efficiency of randomized trial results without biasing the estimates of intervention effects. This approach could allow for smaller, faster trials, decreasing the time until patients can receive effective treatments.
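The decomposition described above can be sketched with a short derivation. The notation is an assumption introduced here for illustration and does not appear in the abstract: $S$ is the RCT enrollment indicator ($S=1$ for RCT, $S=0$ for RWD), $W$ the baseline covariates, $A$ the binary treatment, $Y$ the outcome, $Q(s,w,a) = E[Y \mid S=s, W=w, A=a]$, $\Pi(s \mid w,a) = P(S=s \mid W=w, A=a)$, and $\tau^S(w,a) = Q(1,w,a) - Q(0,w,a)$ the conditional effect of enrollment on the outcome. The sketch also assumes that the target is the covariate-averaged treatment contrast within the RCT stratum.

```latex
% Assumed notation (not from the abstract): Q, \Pi, \tau^S as defined in the lead-in.
\begin{align*}
E[Y \mid W, A=a]
  &= \Pi(1 \mid W,a)\,Q(1,W,a) + \Pi(0 \mid W,a)\,Q(0,W,a) \\
  &= Q(1,W,a) - \Pi(0 \mid W,a)\,\tau^S(W,a), \\[4pt]
\tilde\Psi
  &= E_W\big[\,E[Y \mid W, A=1] - E[Y \mid W, A=0]\,\big]
  && \text{(pooled-ATE estimand)} \\
\Psi^{\#}
  &= E_W\big[\,\Pi(0 \mid W,0)\,\tau^S(W,0) - \Pi(0 \mid W,1)\,\tau^S(W,1)\,\big]
  && \text{(bias estimand)} \\
\Psi
  &= E_W\big[\,Q(1,W,1) - Q(1,W,0)\,\big]
   = \tilde\Psi - \Psi^{\#}
  && \text{(target ATE)}
\end{align*}
```

Under these assumptions, the bias estimand depends on the data only through $\Pi$ and $\tau^S$. If $\tau^S \equiv 0$, the bias term vanishes and the pooled estimand coincides with the target, which is consistent with the claim that a smaller working model for the bias yields greater efficiency.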