Causal disentanglement seeks a representation of data involving latent variables that relate to one another via a causal model. A representation is identifiable if both the latent model and the transformation from latent to observed variables are unique. In this paper, we study observed variables that are a linear transformation of a linear latent causal model. Data from interventions are necessary for identifiability: if one latent variable is missing an intervention, we show that there exist distinct models that cannot be distinguished. Conversely, we show that a single intervention on each latent variable is sufficient for identifiability. Our proof uses a generalization of the RQ decomposition of a matrix that replaces the usual orthogonal and upper triangular conditions with analogues depending on a partial order on the rows of the matrix, with partial order determined by a latent causal model. We corroborate our theoretical results with a method for causal disentanglement that accurately recovers a latent causal model.
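For orientation, the classical RQ decomposition that the proof generalizes can be sketched as follows. This is a minimal illustration of the standard case only: a square invertible matrix A is factored as A = RQ, with R upper triangular and Q having orthonormal rows. The partial-order analogue described above is not reproduced here. The helper names (`rq_decompose`, `matmul`, `dot`) and the example matrix are assumptions made for illustration, not taken from the paper.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def rq_decompose(A):
    """Classical RQ decomposition A = R Q of a square invertible matrix.

    Rows are orthogonalized by Gram-Schmidt from the bottom row upward,
    so row i of A is a combination of rows i, i+1, ..., n-1 of Q.
    That makes R upper triangular with a positive diagonal.
    """
    n = len(A)
    R = [[0.0] * n for _ in range(n)]
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n - 1, -1, -1):
        v = [float(a) for a in A[i]]
        # Remove the components along the already-built rows of Q.
        for j in range(i + 1, n):
            R[i][j] = dot(v, Q[j])
            v = [a - R[i][j] * q for a, q in zip(v, Q[j])]
        norm = math.sqrt(dot(v, v))
        if norm == 0.0:
            raise ValueError("matrix is singular; this sketch needs full rank")
        R[i][i] = norm
        Q[i] = [a / norm for a in v]
    return R, Q


# Hypothetical example matrix, chosen only to exercise the routine.
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]
R, Q = rq_decompose(A)
```

In the classical factorization the triangular shape of R encodes a total order on the rows. The generalization described in the abstract replaces that total order with the partial order induced by the latent causal model.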