Bayesian causal discovery aims to infer the posterior distribution over causal models from observed data, which quantifies epistemic uncertainty and benefits downstream tasks. However, it is computationally challenging because inference must be performed jointly over the combinatorial space of directed acyclic graphs (DAGs) and nonlinear functions. Despite recent progress towards efficient posterior inference over DAGs, existing methods are either limited to variational inference on node permutation matrices for linear causal models, which compromises inference accuracy, or rely on a continuous relaxation of adjacency matrices constrained by a DAG regularizer, which cannot ensure that the resulting graphs are DAGs. In this work, we introduce a scalable Bayesian causal discovery framework based on stochastic gradient Markov chain Monte Carlo (SG-MCMC) that overcomes these limitations. Our approach samples DAGs directly from the posterior without requiring any DAG regularization, simultaneously draws function parameter samples, and is applicable to both linear and nonlinear causal models. To enable our approach, we derive a novel equivalence to permutation-based DAG learning, which makes it possible to use any relaxed gradient estimator defined over permutations. To our knowledge, this is the first framework to apply gradient-based MCMC sampling to causal discovery. Empirical evaluations on synthetic and real-world datasets demonstrate our approach's effectiveness compared to state-of-the-art baselines.
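To illustrate why a permutation-based representation yields DAGs without a regularizer, the following is a minimal sketch of the standard ordering-plus-triangular-mask construction. It is not the equivalence derived in this work, whose details the abstract does not give. The function names `dag_from_permutation` and `is_acyclic` and the binary `edge_mask` argument are hypothetical, chosen only for this example.

```python
import numpy as np


def dag_from_permutation(perm, edge_mask):
    """Build an adjacency matrix that is acyclic by construction.

    perm[k] is the node placed at position k of a topological order
    (hypothetical convention for this sketch). edge_mask is a (d, d)
    binary matrix of candidate edges; an edge a -> b survives only if
    node a precedes node b in the order.
    """
    perm = np.asarray(perm)
    d = len(perm)
    P = np.eye(d, dtype=int)[perm]                 # permutation matrix: P[k, perm[k]] = 1
    U = np.triu(np.ones((d, d), dtype=int), k=1)   # strictly upper triangular mask
    allowed = P.T @ U @ P                          # allowed[a, b] = 1 iff a precedes b
    return np.asarray(edge_mask, dtype=int) * allowed


def is_acyclic(adj):
    """A directed graph on d nodes is acyclic iff adj^d is the zero matrix."""
    d = adj.shape[0]
    return not np.linalg.matrix_power(adj, d).any()


# Example: order 2, 0, 1 with every candidate edge switched on.
A = dag_from_permutation([2, 0, 1], np.ones((3, 3), dtype=int))
```

Because every retained edge points forward in the ordering, any choice of permutation and edge mask gives a valid DAG, so acyclicity never has to be enforced with a penalty term. Sampling or optimizing over the permutation itself is the hard part, and the sketch does not address it.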