Randomized trials are considered the gold standard for making informed decisions in medicine, yet they often lack generalizability to the patient populations in clinical practice. Observational studies, on the other hand, cover a broader patient population but are prone to various biases. Thus, before using an observational study for decision-making, it is crucial to benchmark its treatment effect estimates against those derived from a randomized trial. We propose a novel strategy to benchmark observational studies beyond the average treatment effect. First, we design a statistical test for the null hypothesis that the treatment effects estimated from the two studies, conditioned on a set of relevant features, differ by at most some tolerance. We then estimate an asymptotically valid lower bound on the maximum bias strength for any subgroup in the observational study. Finally, we validate our benchmarking strategy in a real-world setting and show that it leads to conclusions that align with established medical knowledge.