Causal representation learning (CRL) aims to learn low-dimensional causal latent variables from high-dimensional observations. While identifiability has been extensively studied for CRL, estimation has been less explored. In this paper, we explore the use of empirical Bayes (EB) to estimate causal representations. In particular, we consider learning from multi-domain data, where differences between domains are modeled by interventions in a shared underlying causal model. Multi-domain CRL naturally poses a simultaneous inference problem of the kind EB is designed to tackle. We propose an EB $f$-modeling algorithm that improves the quality of learned causal variables by exploiting invariant structure within and across domains. Specifically, we consider a linear measurement model and interventional priors arising from a shared acyclic structural causal model (SCM). When the graph and intervention targets are known, we develop an EM-style algorithm based on causally structured score matching. We further discuss EB $g$-modeling in the context of existing CRL approaches. In experiments on synthetic data, our proposed method achieves more accurate estimation than other CRL methods.