Randomized trials are considered the gold standard for making informed decisions in medicine, yet they often lack generalizability to the patient populations in clinical practice. Observational studies, on the other hand, cover a broader patient population but are prone to various biases. Thus, before using an observational study for decision-making, it is crucial to benchmark its treatment effect estimates against those derived from a randomized trial. We propose a novel strategy to benchmark observational studies beyond the average treatment effect. First, we design a statistical test for the null hypothesis that the treatment effects estimated from the two studies, conditioned on a set of relevant features, differ by at most a given tolerance. We then estimate an asymptotically valid lower bound on the maximum bias strength for any subgroup in the observational study. Finally, we validate our benchmarking strategy in a real-world setting and show that it leads to conclusions that align with established medical knowledge.
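To make the two steps concrete, the following is a minimal sketch of a tolerance test and the resulting lower bound, not the procedure proposed here. It assumes a deliberately simplified setting: a finite set of pre-specified subgroups, approximately normal effect estimates with known standard errors, an unbiased trial whose subgroup effects transport to the observational population, and a Bonferroni correction across subgroups. The function names `bias_lower_bound` and `reject_tolerance`, the input layout, and the example numbers are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def bias_lower_bound(rct, obs, alpha=0.05):
    """Lower confidence bound on the largest subgroup bias.

    rct, obs: dicts mapping a subgroup label to a tuple
    (effect_estimate, standard_error). Hypothetical input layout.
    """
    groups = sorted(set(rct) & set(obs))
    if not groups:
        raise ValueError("the two studies share no subgroup")
    # Two-sided critical value, Bonferroni-corrected over the subgroups,
    # so that all subgroup intervals hold jointly with prob. >= 1 - alpha.
    z = NormalDist().inv_cdf(1 - alpha / (2 * len(groups)))
    bound = 0.0
    for g in groups:
        diff = rct[g][0] - obs[g][0]
        # Standard error of the difference of two independent estimates.
        se = sqrt(rct[g][1] ** 2 + obs[g][1] ** 2)
        # |true difference| >= |estimated difference| - z * se on the joint event.
        bound = max(bound, abs(diff) - z * se)
    return bound


def reject_tolerance(rct, obs, delta, alpha=0.05):
    """Reject H0: every subgroup difference is at most delta in absolute value."""
    return bias_lower_bound(rct, obs, alpha) > delta


# Illustrative, made-up estimates (effect, standard error) per subgroup.
rct = {"young": (0.10, 0.02), "old": (0.12, 0.02)}
obs = {"young": (0.11, 0.02), "old": (0.30, 0.03)}
bound = bias_lower_bound(rct, obs)
print(f"lower bound on max subgroup bias: {bound:.3f}")
print("reject tolerance 0.05:", reject_tolerance(rct, obs, delta=0.05))
```

In this sketch the lower bound is the largest tolerance that the test still rejects, which mirrors the link between the test and the bound described above. Conditioning on continuous features, rather than on a handful of discrete subgroups, would require a different test statistic.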