Unveiling, modeling, and comprehending the causal mechanisms underpinning natural phenomena are fundamental endeavors across many scientific disciplines, and discovering causal relationships from data is one route to new knowledge. Existing causal learning algorithms predominantly focus on the isolated effects of individual variables, overlooking the interplay of multiple variables and their collective behavioral patterns. Furthermore, the ubiquity of high-dimensional data imposes a substantial time cost on causal algorithms. In this paper, we develop a novel method called MgCSL (Multi-granularity Causal Structure Learning), which first leverages a sparse auto-encoder to explore coarse-graining strategies and causal abstractions from micro-variables to macro-variables. MgCSL then takes the multi-granularity variables as inputs to train multilayer perceptrons and probe the causal relationships between variables. To improve efficiency on high-dimensional data, MgCSL introduces a simplified acyclicity constraint to search for the directed acyclic graph among variables. Experimental results show that MgCSL outperforms competitive baselines and finds explainable causal connections on fMRI datasets.
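The abstract does not state the form of MgCSL's simplified acyclicity constraint. As a point of reference only, the sketch below shows a polynomial acyclicity measure of the kind commonly used in continuous-optimization approaches to DAG learning, h(W) = tr((I + W∘W/d)^d) - d, which is zero when the weighted adjacency matrix W describes a DAG and positive when it contains a cycle. The function name `acyclicity_penalty` and the example matrices are illustrative assumptions, not part of MgCSL.

```python
import numpy as np


def acyclicity_penalty(W):
    """Polynomial acyclicity measure: tr((I + W∘W / d)^d) - d.

    Assumption: this is a generic measure from the DAG-learning
    literature, NOT the simplified constraint proposed in MgCSL.
    """
    d = W.shape[0]
    # Element-wise square makes every edge weight non-negative.
    M = np.eye(d) + (W * W) / d
    # The trace of M^d exceeds d only if some closed walk (cycle) exists.
    return float(np.trace(np.linalg.matrix_power(M, d)) - d)


# Hypothetical 3-variable example: a chain 0 -> 1 -> 2 (a DAG).
dag = np.array([[0.0, 1.5, 0.0],
                [0.0, 0.0, -2.0],
                [0.0, 0.0, 0.0]])

# Adding the edge 2 -> 0 closes a cycle.
cyclic = dag.copy()
cyclic[2, 0] = 0.7

print("DAG penalty:   ", acyclicity_penalty(dag))
print("Cyclic penalty:", acyclicity_penalty(cyclic))
```

In a structure learning loop, a penalty of this kind would typically be driven toward zero alongside the data-fitting loss, so that the learned graph ends up acyclic.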