Studies often report estimates of the average treatment effect (ATE). While the ATE summarizes the effect of a treatment on average, it provides no information about the effect of treatment within any given individual. A treatment strategy that uses an individual's information to tailor treatment so as to maximize benefit is known as an optimal dynamic treatment rule (ODTR). Treatment, however, is typically not limited to a single point in time. Consequently, learning an optimal rule for a time-varying treatment may involve learning not just how the comparative benefits of the treatments vary across individuals' characteristics, but also how those benefits vary as relevant circumstances evolve within an individual. The goal of this paper is to provide a tutorial for applied researchers on estimating ODTRs from longitudinal observational and clinical trial data. We describe an approach that uses a doubly robust unbiased transformation of the conditional average treatment effect. We then learn a time-varying ODTR for when to increase buprenorphine-naloxone dose to minimize return to regular opioid use among patients with opioid use disorder. Our analysis highlights the utility of ODTRs in the context of sequential decision making: the learned ODTR outperforms a clinically defined strategy.
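To make the idea of a doubly robust transformation concrete, the following is a minimal sketch for a single decision point with a binary treatment. It is not the paper's longitudinal procedure. It assumes the standard augmented inverse-probability-weighted (AIPW) pseudo-outcome, and it assumes the nuisance estimates (the propensity score `pi` and the outcome predictions `mu0`, `mu1`) have already been fitted elsewhere. The function names, the `stratum` field, and the toy records are all hypothetical, and averaging within strata stands in for whatever regression would be used to model the pseudo-outcome in practice.

```python
from collections import defaultdict


def dr_pseudo_outcome(y, a, pi, mu0, mu1):
    """AIPW-style pseudo-outcome for one binary treatment decision.

    y   : observed outcome
    a   : observed treatment (0 or 1)
    pi  : estimated probability of treatment given covariates (assumed given)
    mu0 : estimated outcome under no treatment (assumed given)
    mu1 : estimated outcome under treatment (assumed given)
    """
    if not 0.0 < pi < 1.0:
        raise ValueError("propensity score must lie strictly between 0 and 1")
    mu_a = mu1 if a == 1 else mu0
    # Inverse-probability weight: positive for treated, negative for untreated.
    weight = a / pi - (1 - a) / (1 - pi)
    # Plug-in contrast plus a weighted residual correction.
    return (mu1 - mu0) + weight * (y - mu_a)


def learn_rule(records, lower_is_better=True):
    """Average pseudo-outcomes within each stratum and pick the better arm.

    Returns a dict mapping stratum -> recommended treatment (0 or 1).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for r in records:
        phi = dr_pseudo_outcome(r["y"], r["a"], r["pi"], r["mu0"], r["mu1"])
        sums[r["stratum"]] += phi
        counts[r["stratum"]] += 1

    rule = {}
    for stratum in sums:
        cate = sums[stratum] / counts[stratum]  # stratum-level effect estimate
        treat = cate < 0 if lower_is_better else cate > 0
        rule[stratum] = int(treat)
    return rule


# Made-up records for illustration only; not data from the paper.
toy_records = [
    {"stratum": "A", "y": 0.3, "a": 1, "pi": 0.5, "mu0": 0.5, "mu1": 0.3},
    {"stratum": "A", "y": 0.5, "a": 0, "pi": 0.5, "mu0": 0.5, "mu1": 0.3},
    {"stratum": "B", "y": 0.5, "a": 1, "pi": 0.5, "mu0": 0.4, "mu1": 0.5},
    {"stratum": "B", "y": 0.4, "a": 0, "pi": 0.5, "mu0": 0.4, "mu1": 0.5},
]

toy_rule = learn_rule(toy_records, lower_is_better=True)
```

Setting `lower_is_better=True` mirrors an outcome such as return to regular opioid use, where a negative estimated effect favors treatment. A time-varying rule would additionally need to account for decisions made at later time points, which this sketch does not attempt.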