The need to model causal knowledge at different levels of granularity arises in several settings. Causal Abstraction provides a framework for formalizing this problem by relating two Structural Causal Models at different levels of detail. Despite increasing interest in applying causal abstraction, e.g. in the interpretability of large machine learning models, the graphical and parametric conditions under which one causal model can abstract another are not known. Furthermore, learning causal abstractions from data is still an open problem. In this work, we tackle both issues for linear causal models with linear abstraction functions. First, we characterize how the low-level coefficients and the abstraction function determine the high-level coefficients, and how the high-level model constrains the causal ordering of the low-level variables. Then, we apply our theoretical results to learn high-level and low-level causal models and their abstraction function from observational data. In particular, we introduce Abs-LiNGAM, a method that leverages the constraints induced by the learned high-level model and the abstraction function to speed up the recovery of the larger low-level model, under the assumption of non-Gaussian noise terms. In simulated settings, we show the effectiveness of learning causal abstractions from data and the potential of our method for improving the scalability of causal discovery.
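To make the first contribution concrete, the following minimal sketch shows one way low-level coefficients and a linear abstraction can determine high-level coefficients. Everything in it is an illustrative assumption rather than the paper's construction: the row-vector convention `X = X W + E`, the toy matrices `W` and `T`, the hypothetical helper `high_level_weights`, and the consistency condition `W T = T M`, which is used here only because it guarantees that the abstracted variables `Y = X T` obey a linear model with weights `M` and noise `E T`.

```python
import numpy as np


def high_level_weights(W, T):
    """Solve W T = T M for M by least squares (hypothetical helper).

    W: (d, d) low-level weights, W[i, j] is the coefficient of edge i -> j.
    T: (d, b) linear abstraction mapping d low-level to b high-level variables.
    Returns M (b, b) and the residual norm ||T M - W T||; a residual near zero
    means the assumed consistency condition holds exactly.
    """
    WT = W @ T
    M, *_ = np.linalg.lstsq(T, WT, rcond=None)
    residual = np.linalg.norm(T @ M - WT)
    return M, residual


# Toy low-level model (assumed): x1, x2 are causes of x3, x4.
W = np.array([
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
])

# Toy abstraction (assumed): y1 = x1 + x2, y2 = x3 + x4.
T = np.array([
    [1.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 1.0],
])

M, residual = high_level_weights(W, T)

# Simulate the low-level model with non-Gaussian (uniform) noise.
rng = np.random.default_rng(0)
E = rng.uniform(-1.0, 1.0, size=(1000, 4))
X = E @ np.linalg.inv(np.eye(4) - W)  # solves X = X W + E
Y = X @ T                             # abstracted high-level samples

# If W T = T M, then Y - Y M equals the abstracted noise E T.
high_level_noise = Y - Y @ M
print(M)
print(residual)
print(np.allclose(high_level_noise, E @ T))
```

In this toy case each high-level variable sums a disjoint block of low-level variables, and the single high-level edge `y1 -> y2` receives the total effect carried by the block. When the residual is not close to zero, no `M` satisfies the assumed condition for the given `W` and `T`.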