Estimating the counterfactual outcome of a treatment is essential for decision-making in public health, clinical science, and other fields. Treatments are often administered sequentially and vary over time, so the number of possible counterfactual outcomes grows exponentially. Furthermore, in modern applications the outcomes are high-dimensional, and conventional average treatment effect estimation fails to capture heterogeneity across individuals. To tackle these challenges, we propose a novel conditional generative framework capable of producing counterfactual samples under time-varying treatment, without the need for explicit density estimation. Our method addresses the mismatch between the observed and counterfactual distributions via a loss function based on inverse probability re-weighting, and supports integration with state-of-the-art conditional generative models such as guided diffusion models and conditional variational autoencoders. We present a thorough evaluation of our method using both synthetic and real-world data. Our results demonstrate that our method generates high-quality counterfactual samples and outperforms state-of-the-art baselines.
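The inverse probability re-weighting mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard construction for time-varying treatments, in which each unit's weight is the inverse of the product of its per-step propensities, and it assumes a self-normalized weighted average for the loss. The function names `ipw_weights` and `weighted_loss`, the `propensities` array, and the clipping constant `eps` are all hypothetical.

```python
import numpy as np


def ipw_weights(propensities, eps=1e-6):
    """Inverse probability weights for a sequence of treatments.

    propensities[i, t] is assumed to hold an estimate of the probability
    that unit i received its observed treatment at step t, given its
    history (hypothetical input; the abstract does not specify how
    propensities are estimated).
    """
    # Clip to avoid division by zero for very unlikely treatment paths.
    p = np.clip(np.asarray(propensities, dtype=float), eps, 1.0)
    # Weight = 1 / (product of per-step propensities along the sequence).
    return 1.0 / np.prod(p, axis=1)


def weighted_loss(per_sample_loss, weights):
    """Self-normalized weighted average of per-sample losses (an assumption)."""
    loss = np.asarray(per_sample_loss, dtype=float)
    w = np.asarray(weights, dtype=float)
    return float(np.sum(w * loss) / np.sum(w))


# Two units, two time steps each.
w = ipw_weights([[0.5, 0.5], [0.5, 1.0]])
total = weighted_loss([1.0, 4.0], w)
```

In this sketch, `per_sample_loss` stands in for whatever per-sample training loss the chosen conditional generative model uses; the weights only rescale each sample's contribution so that rarely observed treatment sequences count for more.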