DR-VIDAL -- Doubly Robust Variational Information-theoretic Deep Adversarial Learning for Counterfactual Prediction and Treatment Effect Estimation on Real World Data

Robustness · Estimation/Estimator · Adversarial Learning · Learning · Variational Autoencoder

May 4, 2023


Shantanu Ghosh, Zheng Feng, Jiang Bian, Kevin Butler, Mattia Prosperi

From arXiv; AMIA Annual Symposium, 2022 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10148269/)

Determining the causal effects of interventions on outcomes from real-world, observational (non-randomized) data, e.g., treatment repurposing using electronic health records, is challenging due to underlying bias. Causal deep learning has improved on traditional techniques for estimating individualized treatment effects (ITE). We present Doubly Robust Variational Information-theoretic Deep Adversarial Learning (DR-VIDAL), a novel generative framework that combines two joint models of treatment and outcome, so that ITE estimation remains unbiased even when one of the two is misspecified. DR-VIDAL integrates: (i) a variational autoencoder (VAE) to factorize confounders into latent variables according to causal assumptions; (ii) an information-theoretic generative adversarial network (Info-GAN) to generate counterfactuals; and (iii) a doubly robust block that incorporates treatment propensities for outcome prediction. On synthetic and real-world datasets (Infant Health and Development Program, Twin Birth Registry, and National Supported Work Program), DR-VIDAL achieves better performance than other non-generative and generative methods. In conclusion, DR-VIDAL uniquely fuses causal assumptions, VAE, Info-GAN, and double robustness into a comprehensive, performant framework. Code is available at https://github.com/Shantanu48114860/DR-VIDAL-AMIA-22 under the MIT license.
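To make the double-robustness property concrete, the following is a minimal sketch of the classic augmented inverse-propensity-weighted (AIPW) estimator of the average treatment effect. It is not the DR-VIDAL network or the authors' code; the function name `aipw_ate`, the simulated data, and the true effect of 2.0 are all assumptions chosen for illustration. The sketch shows that the estimate stays close to the true effect when either the outcome model or the propensity model is wrong, as long as the other is correct.

```python
import numpy as np


def aipw_ate(y, t, propensity, mu0, mu1):
    """Doubly robust (AIPW) estimate of the average treatment effect.

    y: observed outcomes; t: binary treatment indicators (0/1);
    propensity: estimated P(T=1 | X); mu0, mu1: estimated outcomes under
    control and treatment. All inputs are 1-D arrays of equal length.
    """
    y = np.asarray(y, dtype=float)
    t = np.asarray(t, dtype=float)
    mu0 = np.asarray(mu0, dtype=float)
    mu1 = np.asarray(mu1, dtype=float)
    # Clip propensities away from 0 and 1 to avoid division by zero.
    e = np.clip(np.asarray(propensity, dtype=float), 1e-6, 1 - 1e-6)
    # Outcome-model prediction plus a propensity-weighted residual correction.
    dr1 = mu1 + t * (y - mu1) / e
    dr0 = mu0 + (1 - t) * (y - mu0) / (1 - e)
    return float(np.mean(dr1 - dr0))


# Hypothetical simulation: x confounds both treatment and outcome.
rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)
true_e = 1.0 / (1.0 + np.exp(-x))           # true propensity P(T=1 | x)
t = (rng.random(n) < true_e).astype(float)
y = 2.0 * t + x + rng.normal(size=n)         # true treatment effect is 2.0

# Naive difference of means ignores confounding.
naive = y[t == 1].mean() - y[t == 0].mean()

# Correct propensity, deliberately wrong outcome model (all zeros).
est_bad_outcome = aipw_ate(y, t, true_e, np.zeros(n), np.zeros(n))

# Correct outcome model, deliberately wrong propensity (constant 0.5).
est_bad_propensity = aipw_ate(y, t, np.full(n, 0.5), x, x + 2.0)

print(naive, est_bad_outcome, est_bad_propensity)
```

In this simulation the naive difference of means is inflated by confounding, while both AIPW estimates should land near the simulated effect. If both nuisance models are misspecified at once, the guarantee no longer holds.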
