Bayesian causal discovery aims to infer the posterior distribution over causal models from observed data, quantifying epistemic uncertainty and benefiting downstream tasks. However, computational challenges arise from joint inference over the combinatorial space of Directed Acyclic Graphs (DAGs) and nonlinear functions. Despite recent progress towards efficient posterior inference over DAGs, existing methods either are limited to variational inference on node permutation matrices for linear causal models, which compromises inference accuracy, or rely on a continuous relaxation of adjacency matrices constrained by a DAG regularizer, which cannot ensure that the resulting graphs are DAGs. In this work, we introduce a scalable Bayesian causal discovery framework based on a combination of stochastic gradient Markov Chain Monte Carlo (SG-MCMC) and Variational Inference (VI) that overcomes these limitations. Our approach directly samples DAGs from the posterior without requiring any DAG regularization, simultaneously draws function parameter samples, and is applicable to both linear and nonlinear causal models. To enable our approach, we derive a novel equivalence to permutation-based DAG learning, which opens up the possibility of using any relaxed gradient estimator defined over permutations. To our knowledge, this is the first framework applying gradient-based MCMC sampling for causal discovery. Empirical evaluations on synthetic and real-world datasets demonstrate our approach's effectiveness compared to state-of-the-art baselines.
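
To illustrate why a permutation-based parameterization yields DAGs by construction, without any DAG regularizer, the following is a minimal sketch of the standard idea only, not the equivalence derived in this work. It assumes a node ordering and an arbitrary binary edge mask, and keeps an edge i -> j only if node i precedes node j in the ordering. The names `dag_from_permutation`, `is_dag`, `edge_mask`, and `order` are hypothetical and chosen for illustration.

```python
import numpy as np

def dag_from_permutation(edge_mask, order):
    """Keep only edges i -> j where node i precedes node j in `order` (illustrative helper)."""
    d = len(order)
    rank = np.empty(d, dtype=int)
    rank[np.asarray(order)] = np.arange(d)   # rank[node] = position of node in the ordering
    allowed = rank[:, None] < rank[None, :]  # True where the row node precedes the column node
    return (np.asarray(edge_mask).astype(bool) & allowed).astype(int)

def is_dag(adj):
    """A directed graph on d nodes is acyclic iff its adjacency matrix is nilpotent (A^d = 0)."""
    d = adj.shape[0]
    return not np.linalg.matrix_power(adj, d).any()

rng = np.random.default_rng(0)
d = 6
edge_mask = rng.integers(0, 2, size=(d, d))  # arbitrary mask, possibly containing cycles
order = rng.permutation(d)                   # a sampled node ordering
adj = dag_from_permutation(edge_mask, order)
print(is_dag(adj))
```

Because every retained edge points forward in the ordering, no directed cycle can form regardless of the mask, which is why sampling over orderings (or a relaxation of them) never leaves the space of DAGs.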