We propose an empirical Bayes formulation of the structure learning problem, where the prior specification assumes that all node variables have the same error variance, an assumption known to ensure the identifiability of the underlying causal directed acyclic graph (DAG). To facilitate efficient posterior computation, we approximate the posterior probability of each ordering by that of a best DAG model, which naturally leads to an order-based Markov chain Monte Carlo (MCMC) algorithm. Strong selection consistency for our model in high-dimensional settings is proved under a condition that allows heterogeneous error variances, and the mixing behavior of our sampler is theoretically investigated. Further, we propose a new iterative top-down algorithm, which quickly yields an approximate solution to the structure learning problem and can be used to initialize the MCMC sampler. We demonstrate that our method outperforms other state-of-the-art algorithms under various simulation settings, and conclude the paper with a single-cell real-data study illustrating practical advantages of the proposed method.
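To make the role of the equal-variance assumption concrete, here is a minimal sketch of a generic greedy top-down ordering. It is not the iterative top-down algorithm or the MCMC sampler proposed here. It rests on a standard fact about linear structural equation models with equal error variances: among the nodes not yet ordered, one whose parents have all been ordered attains the smallest conditional variance given the ordered nodes. The function names `top_down_order` and `sem_covariance`, and the three-node example graph, are hypothetical and chosen only for illustration; the sketch works on a population covariance matrix rather than on data.

```python
import numpy as np


def top_down_order(cov):
    """Greedy top-down ordering under an assumed equal error variance.

    At each step, pick the remaining node with the smallest conditional
    variance given the nodes already placed in the ordering.
    """
    p = cov.shape[0]
    order = []
    remaining = list(range(p))
    while remaining:
        cond_var = {}
        for j in remaining:
            if order:
                # Var(X_j | X_order) = S_jj - S_jO S_OO^{-1} S_Oj
                s_oo = cov[np.ix_(order, order)]
                s_oj = cov[order, j]
                cond_var[j] = cov[j, j] - s_oj @ np.linalg.solve(s_oo, s_oj)
            else:
                # Nothing ordered yet: use the marginal variance.
                cond_var[j] = cov[j, j]
        nxt = min(remaining, key=lambda j: cond_var[j])
        order.append(nxt)
        remaining.remove(nxt)
    return order


def sem_covariance(B, sigma2=1.0):
    """Covariance of X = B^T X + e with Var(e) = sigma2 * I.

    B[i, j] is the (hypothetical) weight of the edge i -> j.
    """
    p = B.shape[0]
    A = np.linalg.inv(np.eye(p) - B.T)
    return sigma2 * A @ A.T


# Hypothetical example DAG: 2 -> 0 -> 1, unit edge weights, unit error variance.
B = np.zeros((3, 3))
B[2, 0] = 1.0
B[0, 1] = 1.0
cov = sem_covariance(B)
order = top_down_order(cov)
print(order)
```

In practice the population covariance would be replaced by a sample estimate, and an ordering found this way could serve only as a rough starting point; how the proposed method constructs and refines its initial ordering is described in the paper itself.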