A fundamental problem in causal inference is estimating causal effects from observational data. The problem becomes more challenging in the presence of unobserved confounders, because the commonly used back-door adjustment is then no longer applicable. Although instrumental variable (IV) methods can deal with unobserved confounders, they all assume that the treatment directly affects the outcome and that there is no mediator between the treatment and the outcome. This paper aims to use the front-door criterion to address the problem when both unobserved confounders and mediators are present. In practice, it is often difficult to identify from data the set of variables to use for front-door adjustment. Leveraging the representation-learning ability of deep generative models, we propose FDVAE, which learns the representation of a Front-Door adjustment set with a Variational AutoEncoder instead of searching for a set of variables for front-door adjustment. Extensive experiments on synthetic datasets validate the effectiveness of FDVAE and its superiority over existing methods. The experiments also show that the performance of FDVAE is not sensitive to the causal strength of unobserved confounders and that FDVAE remains feasible when the dimensionality of the learned representation does not match that of the ground truth. We further apply the method to three real-world datasets to demonstrate its potential applications.
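For readers unfamiliar with the front-door criterion mentioned above, the following is a minimal sketch of the classical front-door adjustment, P(y | do(x)) = sum_z P(z | x) sum_x' P(y | x', z) P(x'), on a toy discrete model. It is not FDVAE: it assumes the mediator Z is directly observed, whereas FDVAE learns a representation of the adjustment set. The structural model, its probability tables, and the function name `front_door` are all hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical binary structural model (all numbers are illustrative assumptions):
#   U -> X and U -> Y  (U is the unobserved confounder)
#   X -> Z -> Y        (Z is the mediator, i.e. the front-door adjustment set)
p_u = np.array([0.6, 0.4])                 # P(U=u)
p_x_given_u = np.array([[0.8, 0.2],        # P(X=x | U=u), rows indexed by u
                        [0.3, 0.7]])
p_z_given_x = np.array([[0.9, 0.1],        # P(Z=z | X=x), rows indexed by x
                        [0.2, 0.8]])
p_y1_given_zu = np.array([[0.1, 0.5],      # P(Y=1 | Z=z, U=u), rows z, columns u
                          [0.6, 0.9]])
p_y_given_zu = np.stack([1 - p_y1_given_zu, p_y1_given_zu], axis=-1)  # axes [z, u, y]

# Observational joint P(x, z, y): U is summed out, as it would be in real data.
joint = np.einsum("u,ux,xz,zuy->xzy", p_u, p_x_given_u, p_z_given_x, p_y_given_zu)


def front_door(joint_xzy):
    """Return P(y | do(x)) as an [x, y] array using only the observed joint."""
    p_x = joint_xzy.sum(axis=(1, 2))                # P(x)
    p_xz = joint_xzy.sum(axis=2)                    # P(x, z)
    p_z_x = p_xz / p_x[:, None]                     # P(z | x)
    p_y_xz = joint_xzy / p_xz[:, :, None]           # P(y | x, z)
    inner = np.einsum("xzy,x->zy", p_y_xz, p_x)     # sum_x' P(y | x', z) P(x')
    return p_z_x @ inner                            # sum_z P(z | x) * inner[z, y]


fd = front_door(joint)

# Ground truth, computable here only because the toy model exposes U.
p_do = np.einsum("u,xz,zuy->xy", p_u, p_z_given_x, p_y_given_zu)

# Naive conditional P(y | x), which is biased by the confounder U.
naive = joint.sum(axis=1) / joint.sum(axis=(1, 2))[:, None]

print("front-door:", fd[:, 1])
print("truth:     ", p_do[:, 1])
print("naive:     ", naive[:, 1])
```

In this toy model the front-door estimate agrees with the interventional distribution, while the naive conditional does not, even though U is never used by `front_door`. The sketch relies on the front-door conditions holding exactly (Z intercepts all directed paths from X to Y and is not itself confounded by U), which is an assumption built into the tables above.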