Generating data that has the properties external users are interested in, while respecting the correct causal relations among its intrinsic factors, is important, yet the two goals have not been well addressed jointly. This is due to the long-standing challenge of jointly identifying key latent variables, their causal relations, and their correlation with the properties of interest, and of leveraging these discoveries for causally controlled data generation. To address these challenges, we propose a novel deep generative framework called the Correlation-aware Causal Variational Auto-encoder (C2VAE). The framework simultaneously recovers the correlation and causal relationships between properties using disentangled latent vectors. Specifically, causality is captured by learning the causal graph on latent variables through a structural causal model, while correlation is learned via a novel correlation pooling algorithm. Extensive experiments demonstrate C2VAE's ability to accurately recover the true causality and correlation, as well as its superiority in controllable data generation over baseline models.
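To make the idea of a structural causal model on latent variables concrete, the following is a minimal sketch, not C2VAE's actual implementation. It assumes a linear SCM with a known causal graph, which the abstract does not specify; the function name `scm_latents`, the edge weights, and the three-node chain are all hypothetical and chosen only for illustration. It shows how fixing one latent variable (an intervention) propagates to its causal descendants while leaving its ancestors unchanged.

```python
def scm_latents(eps, weights, order, do=None):
    """Evaluate a linear SCM: z_j = sum_i weights[(i, j)] * z_i + eps[j].

    eps:     exogenous noise, one value per latent variable.
    weights: {(parent, child): weight} for each edge of the causal graph.
    order:   node indices in topological order of the causal graph.
    do:      optional {node: value} interventions; an intervened node
             ignores its parents and noise and is clamped to the value.
    """
    do = do or {}
    z = [0.0] * len(eps)
    for j in order:
        if j in do:
            z[j] = do[j]
            continue
        from_parents = sum(w * z[i] for (i, k), w in weights.items() if k == j)
        z[j] = from_parents + eps[j]
    return z


# Hypothetical 3-node chain z0 -> z1 -> z2 with made-up edge weights.
weights = {(0, 1): 2.0, (1, 2): 0.5}
eps = [1.0, 0.0, 0.0]
order = [0, 1, 2]

z = scm_latents(eps, weights, order)                    # observational latents
z_do = scm_latents(eps, weights, order, do={1: 10.0})   # intervene on z1
```

In this toy setting, clamping `z1` changes its descendant `z2` but not its ancestor `z0`, which is the behavior one would expect from causally controlled generation. In a learned model the graph and weights would be estimated from data rather than given.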