Integrating real-world data (RWD) with data from randomized controlled trials (RCTs) is increasingly important for advancing causal inference in scientific research. This combination holds great promise for improving the efficiency of causal effect estimation, offering benefits such as smaller trial sample sizes and faster patient access to drugs. Although numerous data fusion methods are available, selecting the most appropriate one for a specific research question remains challenging. This paper systematically reviews and compares these methods with respect to their assumptions, limitations, and implementation complexity. Through simulations reflecting real-world scenarios, we identify a prevalent risk-reward trade-off across methods. We investigate and interpret this trade-off, providing key insights into the strengths and weaknesses of the various methods and thereby helping researchers navigate the application of data fusion for improved causal inference.