DR-VIDAL -- Doubly Robust Variational Information-theoretic Deep Adversarial Learning for Counterfactual Prediction and Treatment Effect Estimation on Real World Data

Robustness · Estimation/Estimators · Adversarial Learning · Learning · Variational Autoencoding

May 7, 2023

Shantanu Ghosh,Zheng Feng,Jiang Bian,Kevin Butler,Mattia Prosperi

From arXiv; AMIA Annual Symposium, 2022 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10148269/)

Determining the causal effects of interventions on outcomes from real-world, observational (non-randomized) data, e.g., treatment repurposing using electronic health records, is challenging due to underlying bias. Causal deep learning has improved over traditional techniques for estimating individualized treatment effects (ITE). We present Doubly Robust Variational Information-theoretic Deep Adversarial Learning (DR-VIDAL), a novel generative framework that combines two joint models of treatment and outcome, ensuring unbiased ITE estimation even when one of the two is misspecified. DR-VIDAL integrates: (i) a variational autoencoder (VAE) to factorize confounders into latent variables according to causal assumptions; (ii) an information-theoretic generative adversarial network (Info-GAN) to generate counterfactuals; (iii) a doubly robust block incorporating treatment propensities for outcome predictions. On synthetic and real-world datasets (Infant Health and Development Program, Twin Birth Registry, and National Supported Work Program), DR-VIDAL achieves better performance than other non-generative and generative methods. In conclusion, DR-VIDAL uniquely fuses causal assumptions, VAE, Info-GAN, and double robustness into a comprehensive, performant framework. Code is available at: https://github.com/Shantanu48114860/DR-VIDAL-AMIA-22 under the MIT license.
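
The double-robustness property mentioned in the abstract can be illustrated with the classic augmented inverse-propensity-weighted (AIPW) estimator. The sketch below is not DR-VIDAL's neural doubly robust block and it estimates an average rather than an individualized effect; it is a minimal NumPy illustration on a hypothetical simulated dataset. The function names `simulate`, `naive_diff`, and `aipw_ate`, the data-generating process, and the clipping threshold are all assumptions made for this example.

```python
import numpy as np


def simulate(n=200_000, seed=0):
    """Toy observational data with one confounder x (assumed setup); true ATE is 3."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    e = 1.0 / (1.0 + np.exp(-x))        # true propensity P(T=1 | x)
    t = rng.binomial(1, e)              # treatment assignment depends on x
    y = 1.0 + 2.0 * x + 3.0 * t + rng.normal(size=n)
    return x, t, y, e


def naive_diff(y, t):
    """Unadjusted difference in mean outcomes; biased under confounding."""
    return float(y[t == 1].mean() - y[t == 0].mean())


def aipw_ate(y, t, e_hat, mu0_hat, mu1_hat, eps=1e-3):
    """AIPW estimate of the average treatment effect.

    e_hat   : estimated propensity scores
    mu0_hat : predicted outcome under control
    mu1_hat : predicted outcome under treatment
    The estimate is consistent if either the propensity model or the
    outcome model is correct.
    """
    e_hat = np.clip(np.asarray(e_hat, dtype=float), eps, 1.0 - eps)
    dr1 = mu1_hat + t * (y - mu1_hat) / e_hat
    dr0 = mu0_hat + (1 - t) * (y - mu0_hat) / (1.0 - e_hat)
    return float(np.mean(dr1 - dr0))


if __name__ == "__main__":
    x, t, y, e = simulate()
    print("naive difference in means:", naive_diff(y, t))
    # Correct outcome model, deliberately wrong (constant) propensity.
    print("AIPW, wrong propensity:   ",
          aipw_ate(y, t, np.full_like(e, 0.5), 1 + 2 * x, 4 + 2 * x))
    # Correct propensity, deliberately wrong (all-zero) outcome model.
    print("AIPW, wrong outcome model:",
          aipw_ate(y, t, e, np.zeros_like(y), np.zeros_like(y)))
```

In this toy setup the unadjusted difference in means overstates the effect because the confounder raises both the chance of treatment and the outcome, while both AIPW variants land close to the true value of 3 even though one of their two nuisance models is deliberately wrong.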
