Causal representation learning aims to uncover latent higher-level causal variables that affect lower-level observations. Identifying the true latent causal variables from observed data remains a challenge, however, particularly when instantaneous causal relations among the latent variables are allowed. We begin by analyzing three intrinsic indeterminacies in identifying latent variables from observations: transitivity, permutation indeterminacy, and scaling indeterminacy. We find that transitivity plays a key role in impeding the identifiability of latent causal variables. To address the unidentifiability caused by transitivity, we introduce a novel identifiability condition in which the underlying latent causal model is linear-Gaussian, with the causal coefficients and the distribution of the Gaussian noise modulated by an additional observed variable. Under certain assumptions, including the existence of a reference condition under which latent causal influences vanish, we show that the latent causal variables can be identified up to trivial permutation and scaling, and that partial identifiability results can still be obtained when this reference condition is violated for a subset of latent variables. Building on these theoretical results, we propose a novel method, termed Structural caUsAl Variational autoEncoder (SuaVE), which directly learns causal representations and the causal relationships among them, together with the mapping from the latent causal variables to the observed ones. Experimental results on synthetic and real data support the identifiability and consistency results and demonstrate the efficacy of SuaVE in learning causal representations.
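To make the modelling assumption concrete, the following is a minimal sketch of a latent linear-Gaussian causal model whose coefficients and noise scale depend on an observed variable `u`. It is not the SuaVE implementation. The function name `sample_latents`, the three-variable chain, the specific dependence of the weights and noise on `u`, and the choice of `u = 0` as the reference condition are all illustrative assumptions.

```python
import numpy as np


def sample_latents(u, n_samples, rng):
    """Sample latents z from a linear-Gaussian SCM modulated by a scalar u.

    Hypothetical chain z1 -> z2 -> z3. The functional forms below are
    illustrative assumptions, not taken from the paper.
    """
    d = 3
    # W[i, j] is the causal coefficient of z_j on z_i (strictly lower triangular).
    W = np.zeros((d, d))
    W[1, 0] = 1.5 * u   # both coefficients vanish at the assumed reference u = 0
    W[2, 1] = -0.8 * u
    # The Gaussian noise scale is also modulated by u.
    sigma = np.full(d, 0.5 + abs(u))
    eps = rng.normal(0.0, sigma, size=(n_samples, d))
    # Structural equations z = W z + eps, solved as z = (I - W)^{-1} eps.
    z = np.linalg.solve(np.eye(d) - W, eps.T).T
    return z, W


rng = np.random.default_rng(0)
z_ref, W_ref = sample_latents(0.0, 5000, rng)  # reference condition
z_mod, W_mod = sample_latents(1.0, 5000, rng)  # causal influences active
```

In this sketch the latents are mutually independent at `u = 0` and become dependent as `u` moves away from zero, which is the kind of variation across conditions that the identifiability argument relies on.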