Augmenting randomized controlled trials (RCTs) with external real-world data (RWD) has the potential to improve the finite-sample efficiency of treatment effect estimators. We describe the use of adaptive targeted maximum likelihood estimation (A-TMLE) to estimate the average treatment effect (ATE) by decomposing the ATE estimand into two components: a pooled-ATE estimand that combines data from both the RCT and external sources, and a bias estimand that captures the conditional effect of RCT enrollment on the outcome. This approach treats the RCT data as the reference and corrects for inconsistencies of any kind between the RCT and the external data source. Given the growing abundance of external RWD from modern electronic health records, how best to select candidate external patients for data integration remains an open and critical problem. In this work, we first study the robustness property of the A-TMLE estimator and then propose a matching-based sampling strategy that aims to improve the robustness of the estimator with respect to the target estimand. Our proposed strategy is outcome-blind and matches on two one-dimensional scores: the trial enrollment score and the propensity score in the external data. We demonstrate in simulations that our sampling strategy improves the coverage and narrows the width of the confidence intervals produced by A-TMLE. We illustrate our method with a case study that augments the DEVOTE cardiovascular safety trial with data from the Optum Clinformatics claims database.
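To make the idea of outcome-blind matching on two one-dimensional scores concrete, the following is a minimal sketch and not the authors' algorithm. It assumes that both scores have already been estimated for every patient by some model of the analyst's choosing, and it uses greedy 1:1 nearest-neighbour matching without replacement under a Euclidean caliper, which is an illustrative choice that the abstract does not specify. The function name `match_external_to_trial` and the `caliper` parameter are hypothetical.

```python
import math


def match_external_to_trial(trial_scores, external_scores, caliper=0.1):
    """Greedy 1:1 nearest-neighbour matching on two scores (illustrative only).

    Each element of `trial_scores` and `external_scores` is a pair
    (enrollment_score, propensity_score). No outcome data is passed in,
    so the selection is outcome-blind by construction.

    Returns a list of (trial_index, external_index) pairs. An external
    patient is used at most once; a trial patient with no external
    candidate within `caliper` (Euclidean distance) is left unmatched.
    """
    available = set(range(len(external_scores)))
    pairs = []
    for i, trial_point in enumerate(trial_scores):
        best_j, best_dist = None, None
        for j in sorted(available):
            dist = math.dist(trial_point, external_scores[j])
            if dist <= caliper and (best_dist is None or dist < best_dist):
                best_j, best_dist = j, dist
        if best_j is not None:
            pairs.append((i, best_j))
            available.remove(best_j)
    return pairs


# Toy scores, invented purely for illustration.
trial = [(0.5, 0.5), (0.9, 0.1)]
external = [(0.52, 0.5), (0.1, 0.9), (0.88, 0.12), (0.5, 0.49)]
matched = match_external_to_trial(trial, external, caliper=0.1)
```

In this toy example, the external patients at indices 3 and 2 are selected for the two trial patients, and the external patient at index 1 is never selected because no trial patient has similar scores. A greedy pass depends on the order of the trial patients; an optimal-assignment matcher would be a reasonable alternative under the same assumptions.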