Estimating causal effects from observational data is a fundamental task in causal inference. However, latent confounders pose major challenges for causal inference from observational data, such as confounding bias and M-bias. Recent data-driven causal effect estimators tackle confounding bias via balanced representation learning, but they assume that the system is free of M-bias and therefore fail when M-bias is present. In this paper, we identify a challenging and unsolved problem in which a single variable gives rise to confounding bias and M-bias simultaneously. To address this problem, we propose a novel Disentangled Latent Representation learning framework for unbiased Causal effect Estimation (DLRCE) from observational data, which learns latent representations from proxy variables. Specifically, DLRCE learns three sets of latent representations from the measured proxy variables to adjust for both confounding bias and M-bias. Extensive experiments on synthetic datasets and three real-world datasets demonstrate that DLRCE significantly outperforms state-of-the-art estimators when confounding bias and M-bias are both present.
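To make the co-occurrence problem concrete, the following toy simulation may help. It is not the DLRCE method, only a minimal sketch under assumed conditions: a linear Gaussian model with unit coefficients, two latent variables `u1` and `u2`, a measured variable `m` that is a collider of both, and a true effect `tau = 1.0`. All names (`simulate`, `slope_on_t`, `m_is_confounder`) are hypothetical and chosen for illustration.

```python
import random


def slope_on_t(y, t, m=None):
    """OLS coefficient of t in y ~ 1 + t, or in y ~ 1 + t + m when m is given."""
    n = len(y)

    def cov(a, b):
        mean_a, mean_b = sum(a) / n, sum(b) / n
        return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / n

    if m is None:
        return cov(t, y) / cov(t, t)
    # Closed-form two-regressor OLS from the covariance matrix.
    s_tt, s_mm, s_tm = cov(t, t), cov(m, m), cov(t, m)
    s_ty, s_my = cov(t, y), cov(m, y)
    return (s_mm * s_ty - s_tm * s_my) / (s_tt * s_mm - s_tm ** 2)


def simulate(m_is_confounder, n=20000, tau=1.0, seed=0):
    """Return (unadjusted, adjusted-for-m) estimates of the effect of t on y.

    Assumed graph: u1 -> t, u1 -> m <- u2, u2 -> y, t -> y.
    If m_is_confounder is True, m additionally causes both t and y.
    """
    rng = random.Random(seed)
    t, m, y = [], [], []
    for _ in range(n):
        u1, u2 = rng.gauss(0, 1), rng.gauss(0, 1)  # latent, never observed
        mi = u1 + u2 + rng.gauss(0, 1)             # collider of u1 and u2
        direct = mi if m_is_confounder else 0.0    # optional m -> t, m -> y
        ti = u1 + direct + rng.gauss(0, 1)
        yi = tau * ti + u2 + direct + rng.gauss(0, 1)
        t.append(ti)
        m.append(mi)
        y.append(yi)
    return slope_on_t(y, t), slope_on_t(y, t, m)


if __name__ == "__main__":
    print("pure M-structure:", simulate(m_is_confounder=False))
    print("M-bias + confounding:", simulate(m_is_confounder=True))
```

Under these assumed coefficients, the large-sample values can be worked out by hand. In the pure M-structure the unadjusted estimate is about 1.0 (unbiased), while adjusting for `m` gives about 0.8 (M-bias). When `m` is also a confounder, the unadjusted estimate is about 12/7 ≈ 1.71 (confounding bias) and the adjusted one is still about 0.8 (M-bias), so neither including nor excluding `m` recovers the true effect. This is the situation in which information about the latent variables, for instance recovered from proxy variables, is needed.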