This paper studies causal representation learning (CRL) under a general nonparametric latent causal model and a general transformation model that maps the latent variables to the observed data. It establishes \textbf{identifiability} and \textbf{achievability} results using two hard \textbf{uncoupled} interventions per node in the latent causal graph. Notably, it is not known which pairs of interventional environments intervene on the same node (hence, uncoupled environments). For identifiability, the paper establishes that perfect recovery of the latent causal model and variables is guaranteed under uncoupled interventions. For achievability, it designs an algorithm that uses observational and interventional data and recovers the latent causal model and variables with provable guarantees. The algorithm leverages score variations across different environments to estimate the inverse of the transformation and, subsequently, the latent variables. The analysis additionally recovers the existing identifiability result for two hard \textbf{coupled} interventions, that is, the setting in which metadata about which pairs of environments intervene on the same node is known. It is noteworthy that the existing results on nonparametric identifiability require assumptions on interventions and additional faithfulness assumptions. This paper shows that when observational data is available, additional faithfulness assumptions are unnecessary.
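To give some intuition for why score variations across environments are informative, the following is a minimal sketch and not the paper's algorithm. It works directly in the latent space and assumes a linear Gaussian model on a three-node chain, whereas the paper treats the nonparametric case and observes only transformed data. The function names (`score`, `hard_intervention`, `support`), the edge weights, and the variances are all hypothetical choices made for illustration. The sketch relies on a standard fact: a hard intervention on one node changes only that node's factor in the latent density, so the score difference between the observational and an interventional environment is supported on the intervened node and its parents, while the score difference between two hard interventions on the same node is supported on that node alone.

```python
import numpy as np


def score(z, A, v):
    """Score (gradient of the log-density) of a linear Gaussian SCM at z.

    Assumed model: z_i = sum_j A[i, j] * z_j + n_i with n_i ~ N(0, v[i]),
    so log p(z) = -0.5 * sum_i r_i**2 / v[i] + const, where r = (I - A) z.
    """
    M = np.eye(len(z)) - A
    return -M.T @ ((M @ z) / v)


def hard_intervention(A, v, node, new_var):
    """Cut all incoming edges of `node` and replace its noise variance."""
    A_int, v_int = A.copy(), v.copy()
    A_int[node, :] = 0.0
    v_int[node] = new_var
    return A_int, v_int


def support(diff, tol=1e-9):
    """Indices of the coordinates where a score difference is nonzero."""
    return np.flatnonzero(np.abs(diff) > tol).tolist()


# Hypothetical latent chain Z0 -> Z1 -> Z2 with unit noise variances.
A = np.array([[0.0, 0.0, 0.0],
              [0.8, 0.0, 0.0],
              [0.0, -0.5, 0.0]])
v = np.ones(3)
z = np.array([0.3, -1.2, 0.7])  # an arbitrary evaluation point

# Two hard interventions on the same node (node 1) with different variances.
A1, v1 = hard_intervention(A, v, node=1, new_var=2.0)
A2, v2 = hard_intervention(A, v, node=1, new_var=0.5)

s_obs = score(z, A, v)
s_env1 = score(z, A1, v1)
s_env2 = score(z, A2, v2)

# Observational vs. interventional: intervened node and its parent.
obs_vs_env1 = support(s_obs - s_env1)
# Two interventions on the same node: the intervened node only.
env1_vs_env2 = support(s_env1 - s_env2)

print("obs vs env1:", obs_vs_env1)
print("env1 vs env2:", env1_vs_env2)
```

In this toy setting the first difference is nonzero at indices 0 and 1 (node 1 and its parent) and the second only at index 1. In the uncoupled setting described above, the pairing of environments is not given, so this sparsity pattern would have to be discovered from data rather than assumed.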