Causal representation learning (CRL) has attracted growing interest from the causal inference and artificial intelligence communities because of its potential to disentangle complex data-generating mechanisms into causally interpretable latent features by leveraging the heterogeneity of modern datasets. In this paper, we contribute to the CRL literature by focusing on a stylized linear structural causal model over latent features, together with a linear mixing function that maps the latent features to the observed data or measurements. Existing linear CRL methods often rely on stringent assumptions, such as access to single-node interventional data or restrictive distributional constraints on the latent features and/or the exogenous measurement noise. These prerequisites, however, are easily violated in practice. We propose a novel linear CRL algorithm that, unlike existing methods, operates under weaker assumptions on environment heterogeneity and data-generating distributions while still recovering the latent causal features up to an equivalence class. We validate the algorithm through synthetic experiments and an interpretability analysis of large language models, demonstrating both its superiority over competing methods in finite samples and its potential for integrating causality into the understanding of artificial intelligence. The source code is available at https://github.com/utulie/code_for_linear_crl_paper_creator.
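To make the setting concrete, the following is a minimal sketch of the kind of data-generating process the abstract describes: a linear structural causal model over latent features, observed through a linear mixing function, across several environments. It is not the proposed algorithm and is not taken from the linked repository. The function name `simulate_environment`, the matrices `B` and `G`, the Gaussian noise, and the choice to model environment heterogeneity as a change in latent noise scales are all illustrative assumptions.

```python
import numpy as np

def simulate_environment(B, G, noise_scales, n_samples, rng, obs_noise=0.1):
    """Sample latents from a linear SCM and observe them through a linear mixing.

    All names here are hypothetical; this only illustrates the model class.
    """
    d = B.shape[0]
    # Exogenous latent noise; its per-coordinate scale is allowed to differ
    # between environments (an assumed form of heterogeneity).
    eps = rng.normal(size=(n_samples, d)) * noise_scales
    # Row-vector form of the SCM: Z = Z B + eps, so Z = eps (I - B)^{-1}.
    # B[i, j] != 0 encodes a latent edge Z_i -> Z_j.
    Z = eps @ np.linalg.inv(np.eye(d) - B)
    # Linear mixing into the observed measurements, plus measurement noise.
    X = Z @ G.T + obs_noise * rng.normal(size=(n_samples, G.shape[0]))
    return Z, X

rng = np.random.default_rng(0)

# Assumed latent graph: a chain Z1 -> Z2 -> Z3 (strictly upper triangular B).
B = np.array([[0.0, 0.8, 0.0],
              [0.0, 0.0, -0.5],
              [0.0, 0.0, 0.0]])

# Assumed mixing matrix from 3 latent features to 5 observed coordinates.
G = rng.normal(size=(5, 3))

# Two environments that differ only in the latent noise scales.
env_scales = [np.array([1.0, 1.0, 1.0]), np.array([2.0, 1.0, 0.5])]
data = [simulate_environment(B, G, s, 1000, rng) for s in env_scales]
```

A CRL method in this setting would see only the observed `X` from each environment and would aim to recover the latent features, up to an equivalence class, without access to `Z`, `B`, or `G`.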