Bayesian ICA for Causal Discovery

Causal discovery based on Independent Component Analysis (ICA) has achieved remarkable success through the LiNGAM framework, which exploits non-Gaussianity and independence of noise variables to identify causal order. However, classical LiNGAM methods rely on the strong assumption that there exists an ordering under which the noise terms are exactly independent, an assumption that is often violated in the presence of confounding. In this paper, we propose a general information-theoretic framework for causal order estimation that remains applicable under arbitrary confounding. Rather than imposing independence as a hard constraint, we quantify the degree of confounding by the multivariate mutual information among the noise variables. This quantity is decomposed into a sum of mutual information terms along a causal order and is estimated using Bayesian marginal likelihoods. The resulting criterion can be interpreted as Bayesian ICA for causal discovery, where causal order selection is formulated as a model selection problem over permutations. Under standard regularity conditions, we show that the proposed Bayesian mutual information estimator is consistent, with redundancy of order $O(\log n)$. To avoid non-identifiability caused by Gaussian noise, we employ non-Gaussian predictive models, including multivariate $t$ distributions, whose marginal likelihoods can be evaluated via MCMC. The proposed method recovers classical LiNGAM and DirectLiNGAM as limiting cases in the absence of confounding, while providing a principled ranking of causal orders when confounding is present. This establishes a unified, confounding-aware, and information-theoretically grounded extension of ICA-based causal discovery.
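
As a sketch of the decomposition mentioned above, suppose (the notation here is assumed for illustration and is not taken from the paper) that $e_1, \dots, e_p$ denote the noise variables indexed along a candidate causal order, that "multivariate mutual information" refers to the total correlation, and that all differential entropies $H(\cdot)$ are finite. The chain rule for entropy then gives

$$
I(e_1, \dots, e_p) \;=\; \sum_{i=1}^{p} H(e_i) - H(e_1, \dots, e_p) \;=\; \sum_{i=2}^{p} I(e_i;\, e_1, \dots, e_{i-1}).
$$

Each term on the right is nonnegative, so under these assumptions the sum is zero exactly when the noise variables are mutually independent, which corresponds to the unconfounded setting assumed by classical LiNGAM.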
