We consider the challenges of causal inference in settings where data from a randomized trial are augmented with control data from an external source to improve efficiency in estimating the average treatment effect (ATE). Within a formal causal inference framework, we outline sufficient causal assumptions about the exchangeability of internal and external controls for identifying the ATE, and we connect these assumptions to novel graphical criteria. We propose estimators, review efficiency bounds, develop an approach for efficient doubly robust estimation even when unknown nuisance models are estimated with flexible machine learning methods, and demonstrate finite-sample performance in a simulation study. To illustrate the ideas and methods, we apply the framework to a trial investigating the effect of risdiplam on motor function in patients with spinal muscular atrophy, for which an external set of control patients is available from a previous trial.
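As a rough illustration of the kind of estimator discussed above, the following sketch computes an augmented inverse-probability-weighted (AIPW) estimate of the ATE in the trial population, with the control outcome regression fit on the pooled internal and external controls. It is not the estimator developed in the paper. It assumes a known randomization probability `e`, linear outcome models, and conditional mean exchangeability of internal and external controls given covariates. The function names, variable names (`X`, `A`, `Y`, `S`), and the simulated data-generating process are all hypothetical choices made for this example.

```python
import numpy as np


def aipw_ate_with_external_controls(X, A, Y, S, e=0.5):
    """AIPW estimate of the ATE in the trial population (illustrative sketch).

    X: (n, p) covariates; A: treatment indicator (0/1); Y: outcome;
    S: 1 for trial participants, 0 for external controls;
    e: known randomization probability in the trial (assumed).
    """
    X, A, Y, S = (np.asarray(v) for v in (X, A, Y, S))
    D = np.column_stack([np.ones(len(X)), X])  # design matrix with intercept

    treated = (S == 1) & (A == 1)
    controls = A == 0  # pooled internal and external controls

    # Linear outcome regressions (an assumption; any regression could be used).
    beta1, *_ = np.linalg.lstsq(D[treated], Y[treated], rcond=None)
    beta0, *_ = np.linalg.lstsq(D[controls], Y[controls], rcond=None)

    # Evaluate the augmented estimating function on trial participants only.
    trial = S == 1
    Dt, At, Yt = D[trial], A[trial], Y[trial]
    m1, m0 = Dt @ beta1, Dt @ beta0
    phi = (At * (Yt - m1) / e + m1) - ((1 - At) * (Yt - m0) / (1 - e) + m0)
    return float(phi.mean())


def simulate(n_trial=2000, n_ext=2000, tau=1.0, seed=0):
    """Hypothetical data: external controls differ in covariates only."""
    rng = np.random.default_rng(seed)
    n = n_trial + n_ext
    S = np.r_[np.ones(n_trial, dtype=int), np.zeros(n_ext, dtype=int)]
    X = rng.normal(size=(n, 2))
    X[S == 0] += 0.5  # covariate shift in the external source
    # Randomize within the trial; external subjects are all controls.
    A = np.where(S == 1, rng.binomial(1, 0.5, size=n), 0)
    Y = 1.0 + X @ np.array([1.0, -0.5]) + tau * A + rng.normal(size=n)
    return X, A, Y, S


if __name__ == "__main__":
    X, A, Y, S = simulate()
    print(aipw_ate_with_external_controls(X, A, Y, S, e=0.5))
```

Because the randomization probability is known, the augmentation terms keep this sketch consistent for the trial-population ATE even if the outcome models are misspecified. The external controls enter only through the fit of the control regression, so any efficiency gain here is informal and depends on the exchangeability assumption holding.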