The goal of causal representation learning is to find a representation of data that consists of causally related latent variables. We consider a setup where one has access to data from multiple domains that potentially share a causal representation. Crucially, observations in different domains are assumed to be unpaired; that is, we observe only the marginal distribution in each domain, not the joint distribution across domains. In this paper, we give sufficient conditions for identifiability of the joint distribution and the shared causal graph in a linear setup. Here, identifiability means that the joint distribution and the shared causal representation can be uniquely recovered from the marginal distributions in each domain. We turn our identifiability results into a practical method for recovering the shared latent causal graph.
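To make the unpaired multi-domain setup concrete, the following is a minimal simulation sketch, not the method of the paper. It assumes a linear structural equation model over shared latent variables, Laplace (non-Gaussian) noise, and one linear mixing matrix per domain; the names `sample_latents`, `sample_unpaired_domains`, `A`, and `mixings` are hypothetical and chosen only for illustration. Each domain draws its own latent samples, so rows in different domains do not correspond to each other and only the per-domain marginals are available.

```python
import numpy as np


def sample_latents(n, A, rng):
    """Draw n samples of Z solving Z = A Z + eps for a DAG adjacency matrix A."""
    d = A.shape[0]
    # Independent non-Gaussian noise; the Laplace choice is an assumption of this sketch.
    eps = rng.laplace(size=(n, d))
    # Z = (I - A)^{-1} eps, written for samples stored as rows.
    return eps @ np.linalg.inv(np.eye(d) - A).T


def sample_unpaired_domains(n_per_domain, A, mixings, rng):
    """Return one observation matrix per domain, each built from fresh latent draws."""
    return [
        sample_latents(n, A, rng) @ G.T  # X = G Z, samples as rows
        for n, G in zip(n_per_domain, mixings)
    ]


rng = np.random.default_rng(0)

# Shared latent causal graph Z1 -> Z2 -> Z3 (strictly lower triangular, hence acyclic).
A = np.array([
    [0.0, 0.0, 0.0],
    [0.8, 0.0, 0.0],
    [0.0, -0.5, 0.0],
])

# Hypothetical domain-specific mixing matrices with 5 and 4 observed variables.
mixings = [rng.normal(size=(5, 3)), rng.normal(size=(4, 3))]

X1, X2 = sample_unpaired_domains([1000, 800], A, mixings, rng)
print(X1.shape, X2.shape)  # (1000, 5) (800, 4)
```

Because the two domains have different sample sizes and independent latent draws, no cross-domain covariance can be estimated from `X1` and `X2` directly. This is the sense in which only the marginal distributions are observed.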