Causal representation learning aims to uncover latent, higher-level causal representations that affect lower-level observations. However, identifying the true latent causal representations from observed data while allowing instantaneous causal relations among the latent variables remains a challenge. To this end, we start by analyzing three intrinsic properties that arise when identifying the latent space from observations: transitivity, permutation indeterminacy, and scaling indeterminacy. We find that transitivity plays a key role in impeding the identifiability of latent causal representations. To address the unidentifiability caused by transitivity, we introduce a novel identifiability condition in which the underlying latent causal model is linear-Gaussian, with causal coefficients and Gaussian noise distributions modulated by an additional observed variable. Under some mild assumptions, we show that the latent causal representations can be identified up to trivial permutation and scaling. Building on this theoretical result, we propose a novel method, termed Structural caUsAl Variational autoEncoder, which directly learns the latent causal representations and the causal relationships among them, together with the mapping from the latent causal variables to the observed ones. We show that the proposed method learns the true parameters asymptotically. Experimental results on synthetic and real data support the identifiability and consistency results and demonstrate the efficacy of the proposed method in learning latent causal representations.
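To make the assumed generative process concrete, the following is a minimal sketch of a linear-Gaussian latent causal model whose causal coefficients and noise parameters depend on an additional observed variable `u`. It is an illustration under stated assumptions, not the proposed method: the names `sample_latents` and `observe`, the two example environments, the strictly upper-triangular coefficient layout (latent variables listed in causal order), and the `tanh` mixing are all hypothetical choices made for this example.

```python
import math
import random


def sample_latents(u, coeffs, noise_mean, noise_std, rng):
    """Draw one sample of the latent causal variables z given the observed variable u.

    coeffs[u][j][i] is the (assumed) causal coefficient of the edge z_j -> z_i.
    Only entries with j < i are used, so the latents are in causal order.
    noise_mean[u][i] and noise_std[u][i] parameterize the Gaussian noise of z_i.
    """
    lam = coeffs[u]
    n = len(lam)
    z = [0.0] * n
    for i in range(n):
        # Gaussian noise whose mean and standard deviation are modulated by u.
        eps_i = rng.gauss(noise_mean[u][i], noise_std[u][i])
        # Linear structural equation: weighted sum of the parents plus noise.
        z[i] = sum(lam[j][i] * z[j] for j in range(i)) + eps_i
    return z


def observe(z, mixing):
    """Map latents to observations with a hypothetical nonlinear mixing (tanh of a linear map)."""
    return [math.tanh(sum(w * zk for w, zk in zip(row, z))) for row in mixing]


# Two hypothetical values of u; the chain z_0 -> z_1 -> z_2 keeps its structure
# while the coefficients and noise parameters change with u.
coeffs = {
    0: [[0.0, 2.0, 0.0], [0.0, 0.0, 3.0], [0.0, 0.0, 0.0]],
    1: [[0.0, -1.0, 0.0], [0.0, 0.0, 0.5], [0.0, 0.0, 0.0]],
}
noise_mean = {0: [0.0, 0.0, 0.0], 1: [1.0, -1.0, 0.5]}
noise_std = {0: [1.0, 1.0, 1.0], 1: [0.5, 2.0, 1.5]}
mixing = [[1.0, 0.5, -0.5], [0.2, -1.0, 0.3], [0.7, 0.1, 1.0], [-0.4, 0.6, 0.9]]

rng = random.Random(0)
for u in (0, 1):
    z = sample_latents(u, coeffs, noise_mean, noise_std, rng)
    x = observe(z, mixing)
    print(u, z, x)
```

In this sketch the variation of the coefficients and noise parameters across values of `u` is the extra signal that the identifiability condition relies on; how much variation is required is specified by the assumptions of the theoretical result, not by this example.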