Randomized controlled trials (RCTs) are the accepted standard for treatment effect estimation, but they can be infeasible for ethical reasons or because of prohibitive costs. Single-arm trials, in which all patients belong to the treatment group, can be a viable alternative but require access to an external control group. We propose an identifiable deep latent-variable model for this scenario that can also account for missing covariate observations by modeling their structured missingness patterns. Our method uses amortized variational inference to learn both group-specific and identifiable shared latent representations, which can subsequently be used for (i) patient matching when treatment outcomes are not available for the treatment group, or (ii) direct treatment effect estimation when outcomes are available for both groups. We evaluate the model on a public benchmark and on a data set consisting of a published RCT study and real-world electronic health records. Compared to previous methods, our results show improved performance both for direct treatment effect estimation and for effect estimation via patient matching.
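To make the patient-matching use case concrete, the following is a minimal sketch, not the procedure used in this work. It assumes that shared latent representations have already been produced by a trained encoder, and it swaps in plain Euclidean 1-nearest-neighbor matching with replacement as an illustrative matching rule. The function names `match_controls` and `att_from_matches` and all toy values are hypothetical.

```python
import numpy as np


def match_controls(z_treated, z_control):
    """For each treated patient, return the index of the nearest control.

    Assumption: matching is Euclidean 1-nearest-neighbor with replacement
    in the shared latent space; this is an illustrative choice only.
    """
    # Pairwise differences, shape (n_treated, n_control, latent_dim).
    diffs = z_treated[:, None, :] - z_control[None, :, :]
    # Pairwise Euclidean distances, shape (n_treated, n_control).
    dists = np.linalg.norm(diffs, axis=-1)
    return dists.argmin(axis=1)


def att_from_matches(y_treated, y_control, matches):
    """Mean outcome difference between treated patients and their matches.

    This estimates the average treatment effect on the treated (ATT)
    once outcomes for both groups are available.
    """
    return float(np.mean(y_treated - y_control[matches]))


# Toy, made-up latent representations and outcomes (hypothetical values).
z_treated = np.array([[0.0, 0.0], [1.0, 1.0]])
z_control = np.array([[0.9, 1.1], [0.1, -0.1], [5.0, 5.0]])
y_treated = np.array([3.0, 5.0])
y_control = np.array([2.0, 1.0, 0.0])

matches = match_controls(z_treated, z_control)
att = att_from_matches(y_treated, y_control, matches)
```

In this sketch the matching step uses only the latent representations, so it can be run before any outcomes are observed for the single-arm cohort; the outcome difference is computed separately once those outcomes exist.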