Why does a phenomenon occur? Addressing this question is central to most scientific inquiries and often relies on simulations of scientific models. As models become more intricate, deciphering the causes behind phenomena in high-dimensional spaces of interconnected variables becomes increasingly challenging. Causal Representation Learning (CRL) offers a promising avenue for uncovering interpretable causal patterns within these simulations through an interventional lens. However, developing general CRL frameworks suitable for practical applications remains an open challenge. We introduce Targeted Causal Reduction (TCR), a method for condensing complex intervenable models into a concise set of causal factors that explain a specific target phenomenon. We propose an information-theoretic objective for learning TCR from interventional data generated by simulations, establish identifiability for continuous variables under shift interventions, and present a practical algorithm for learning TCRs. We demonstrate TCR's ability to generate interpretable high-level explanations from complex models on toy and mechanical systems, illustrating its potential to assist scientists studying complex phenomena across a broad range of disciplines.