This paper studies causal representation learning (CRL) under a general nonparametric latent causal model and a general transformation model that maps the latent data to the observational data. It establishes identifiability and achievability results using two hard uncoupled interventions per node in the latent causal graph. Notably, one does not know which pairs of interventional environments intervene on the same node (hence, uncoupled). For identifiability, the paper establishes that perfect recovery of the latent causal model and variables is guaranteed under uncoupled interventions. For achievability, an algorithm is designed that uses observational and interventional data and recovers the latent causal model and variables with provable guarantees. This algorithm leverages score variations across different environments to estimate the inverse of the transformation and, subsequently, the latent variables. The analysis also recovers the identifiability result for two hard coupled interventions, that is, the setting in which metadata identifying the pairs of environments that intervene on the same node is available. This paper also shows that when observational data is available, the additional faithfulness assumptions adopted in the existing literature are unnecessary.
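To give some intuition for why score variations across environments are informative, the following is a minimal sketch and not the paper's algorithm. It assumes a linear Gaussian latent model on a three-node chain (the paper's setting is nonparametric) and evaluates scores directly in the latent space, whereas the paper works with observed data under an unknown transformation. All names (`score`, `hard_intervention`, `support`, the weights in `W`) are hypothetical and chosen for illustration only.

```python
import numpy as np

def score(z, W, var):
    """Score (gradient of the log-density) of a linear Gaussian SCM.

    Assumed model: z_i = sum_j W[i, j] * z_j + eps_i, eps_i ~ N(0, var[i]).
    z has shape (n, d); the score is returned row-wise with the same shape.
    """
    M = np.eye(W.shape[0]) - W
    residual = z @ M.T              # eps implied by each sample
    return -(residual / var) @ M

def hard_intervention(W, var, node, new_var):
    """Cut all incoming edges of `node` and replace its noise variance."""
    W_new, var_new = W.copy(), var.copy()
    W_new[node, :] = 0.0
    var_new[node] = new_var
    return W_new, var_new

def support(score_a, score_b, tol=1e-9):
    """Coordinates in which two score functions differ on the sample points."""
    gap = np.mean(np.abs(score_a - score_b), axis=0)
    return np.flatnonzero(gap > tol).tolist()

# Hypothetical latent chain 0 -> 1 -> 2 with unit noise variances.
W = np.array([[0.0, 0.0, 0.0],
              [0.8, 0.0, 0.0],
              [0.0, -0.5, 0.0]])
var = np.ones(3)

rng = np.random.default_rng(0)
z = rng.normal(size=(1000, 3))      # arbitrary evaluation points

obs = score(z, W, var)
hard_2a = score(z, *hard_intervention(W, var, node=2, new_var=0.5))
hard_2b = score(z, *hard_intervention(W, var, node=2, new_var=2.0))
hard_1 = score(z, *hard_intervention(W, var, node=1, new_var=0.5))

obs_vs_hard = support(obs, hard_2a)          # intervened node and its parent
same_node = support(hard_2a, hard_2b)        # intervened node only
different_nodes = support(hard_2a, hard_1)   # spreads over several coordinates
```

In this toy model, two hard interventions on the same node change only that node's factor of the density, so their score difference is confined to a single coordinate, while a pair of environments that intervene on different nodes differs in several coordinates. This contrast is one way to see how pairs of environments might be matched without coupling metadata, under the assumptions stated above.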