Estimating the counterfactual outcome of a treatment is essential for decision-making in public health, clinical science, and other fields. Treatments are often administered sequentially and vary over time, so the number of possible counterfactual outcomes grows exponentially. Furthermore, in modern applications the outcomes are high-dimensional, and conventional average treatment effect estimation fails to capture differences across individuals. To tackle these challenges, we propose a novel conditional generative framework capable of producing counterfactual samples under time-varying treatment, without the need for explicit density estimation. Our method addresses the mismatch between the observed and counterfactual distributions via a loss function based on inverse probability re-weighting, and supports integration with state-of-the-art conditional generative models such as guided diffusion and conditional variational autoencoders. We present a thorough evaluation of our method using both synthetic and real-world data. Our results demonstrate that our method generates high-quality counterfactual samples and outperforms state-of-the-art baselines.
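To make the inverse probability re-weighting idea concrete, the following is a minimal sketch, not the loss used in this work. It assumes that an estimated propensity (the probability of the treatment actually received at each time step, given the history) is already available for every sample, and that some generative model supplies a per-sample loss. The names `ipw_weights`, `weighted_loss`, and the `floor` parameter are hypothetical and introduced only for illustration.

```python
from math import prod


def ipw_weights(propensities, floor=1e-3):
    """Inverse probability weights for a sequence of treatments.

    propensities: list of rows, one row per sample; each row holds the
    estimated probability of the treatment actually received at each
    time step (assumed to be given, e.g. by a fitted classifier).
    floor: hypothetical lower bound that keeps weights from exploding.
    """
    weights = []
    for row in propensities:
        clipped = [min(max(p, floor), 1.0) for p in row]
        # The weight is the inverse of the product over time steps.
        weights.append(1.0 / prod(clipped))
    return weights


def weighted_loss(per_sample_loss, weights):
    """Self-normalized weighted average of per-sample losses."""
    total = sum(weights)
    return sum(w * l for w, l in zip(weights, per_sample_loss)) / total


# Two samples, two time steps each. The first sample received an
# unlikely treatment sequence, so it is weighted more heavily.
example_weights = ipw_weights([[0.5, 0.5], [1.0, 1.0]])
example_loss = weighted_loss([1.0, 2.0], example_weights)
```

Samples whose observed treatment sequence was unlikely receive larger weights, which shifts the training objective from the observed distribution toward the counterfactual one. The self-normalization in `weighted_loss` is one common design choice for reducing variance; it is an assumption here rather than something the abstract specifies.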