Randomized controlled trials (RCTs) serve as the cornerstone for understanding causal effects, yet extending inferences to target populations is challenging because of effect heterogeneity and the underrepresentation of some subgroups in the trial. This paper addresses the problem of identifying and characterizing underrepresented subgroups in RCTs and proposes a framework for refining target populations to improve generalizability. We introduce an optimization-based approach, Rashomon Set of Optimal Trees (ROOT), to characterize underrepresented groups. ROOT optimizes the target subpopulation distribution by minimizing the variance of the target average treatment effect estimate, which yields more precise treatment effect estimates. ROOT also produces interpretable characteristics of the underrepresented population, which helps researchers communicate the results. In synthetic data experiments, our approach shows improved precision and interpretability compared to alternatives. We apply the methodology to extend inferences from the Starting Treatment with Agonist Replacement Therapies (START) trial, which investigates the effectiveness of medication for opioid use disorder, to the real-world population represented by the Treatment Episode Dataset: Admissions (TEDS-A). By refining target populations with ROOT, the framework offers a systematic way to improve decision-making accuracy and inform future trials in diverse populations.