Randomized controlled trials (RCTs) are commonly regarded as the gold standard for causal inference and play a pivotal role in modern evidence-based medicine. However, their sample sizes are often too small to support statistically significant causal conclusions for subgroups that are less prevalent in the population. In contrast, observational data are becoming increasingly accessible in large volumes but can be biased by hidden confounding. Given these complementary features, we propose a power likelihood approach to augmenting RCTs with observational data for robust estimation of heterogeneous treatment effects. We provide a data-adaptive procedure that selects the influence factor regulating the information from the observational data by maximizing the Expected Log Predictive Density (ELPD). We conduct a simulation study to illustrate the efficacy of our method and its favourable features compared to existing approaches. Lastly, we apply the proposed method to data from Tennessee's Student Teacher Achievement Ratio (STAR) Study to demonstrate its usefulness and practicality in real-world data analysis.
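
To make the idea concrete, the following is a minimal sketch of a power likelihood with a data-selected influence factor. It is a toy illustration under assumptions that the abstract does not state: a single normal mean stands in for the treatment effect, the noise standard deviation is known, the prior is a conjugate normal, and the ELPD is estimated by K-fold cross-validation on the RCT outcomes. The function names (`power_posterior`, `cv_elpd`, `select_eta`), the grid of candidate values, and the symbol `eta` for the influence factor are all hypothetical and are not taken from the paper.

```python
import numpy as np


def power_posterior(y_rct, y_obs, eta, sigma=1.0, prior_mean=0.0, prior_sd=10.0):
    """Posterior mean and variance of a normal mean (known noise sd `sigma`)
    when the observational likelihood is raised to the power `eta`.

    eta = 0 ignores the observational data; eta = 1 pools it fully.
    """
    y_rct, y_obs = np.asarray(y_rct, float), np.asarray(y_obs, float)
    # Raising a Gaussian likelihood to the power eta scales its precision by eta.
    precision = 1.0 / prior_sd**2 + (len(y_rct) + eta * len(y_obs)) / sigma**2
    mean = (prior_mean / prior_sd**2
            + (y_rct.sum() + eta * y_obs.sum()) / sigma**2) / precision
    return mean, 1.0 / precision


def cv_elpd(y_rct, y_obs, eta, sigma=1.0, n_folds=5):
    """K-fold estimate of the ELPD, scored on held-out RCT outcomes only
    (an assumed estimator; the abstract does not specify one)."""
    y_rct = np.asarray(y_rct, float)
    total = 0.0
    for held_out in np.array_split(np.arange(len(y_rct)), n_folds):
        train = np.delete(y_rct, held_out)
        mean, var = power_posterior(train, y_obs, eta, sigma)
        pred_var = var + sigma**2  # posterior uncertainty plus observation noise
        resid = y_rct[held_out] - mean
        total += np.sum(-0.5 * np.log(2 * np.pi * pred_var)
                        - 0.5 * resid**2 / pred_var)
    return total


def select_eta(y_rct, y_obs, grid=None, **kwargs):
    """Return the grid value of eta with the highest estimated ELPD, plus all scores."""
    grid = np.linspace(0.0, 1.0, 21) if grid is None else np.asarray(grid, float)
    scores = np.array([cv_elpd(y_rct, y_obs, eta, **kwargs) for eta in grid])
    return grid[int(np.argmax(scores))], scores


# Usage on simulated data: a small trial and a large, confounded observational sample.
rng = np.random.default_rng(0)
y_rct = rng.normal(1.0, 1.0, size=40)     # unbiased but small
y_obs = rng.normal(3.0, 1.0, size=2000)   # large but shifted by hidden confounding
eta_hat, elpd_scores = select_eta(y_rct, y_obs)
```

In this sketch the observational data are scored only through their effect on predictions for held-out trial participants, so a strongly biased observational sample is expected to receive a small `eta`, while a sample consistent with the trial can receive a larger one.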