Discovering causal relationships by recovering the directed acyclic graph (DAG) structure from observed data is a well-known and challenging combinatorial problem. When there are latent variables, the problem becomes even more difficult. In this paper, we first propose a DAG structure recovery algorithm based on the Cholesky factorization of the covariance matrix of the observed data. The algorithm is fast, easy to implement, and has theoretical guarantees for exact recovery. On synthetic and real-world datasets, it is significantly faster than previous methods and achieves state-of-the-art performance. Furthermore, under the equal error variances assumption, we incorporate an optimization procedure into the Cholesky-factorization-based algorithm to handle DAG recovery with latent variables. Numerical simulations show that the modified "Cholesky + optimization" algorithm recovers the ground-truth graph in most cases and outperforms existing algorithms.
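To illustrate why a Cholesky factorization of the covariance matrix can carry DAG structure at all, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes a linear structural equation model x = Bx + e with independent noise, and it assumes the variables are already listed in a topological order, so that B is strictly lower triangular. Under those assumptions the covariance is Σ = (I − B)⁻¹ Ω (I − B)⁻ᵀ with Ω diagonal, and by uniqueness of the Cholesky factor Σ = LLᵀ we get L = (I − B)⁻¹ Ω^{1/2}. The function name `weights_from_covariance` and the example matrices are hypothetical, chosen only for illustration.

```python
import numpy as np


def weights_from_covariance(cov):
    """Recover edge weights B and noise variances from a covariance matrix.

    Assumption (not from the paper): variables are already in topological
    order, so B is strictly lower triangular in x = B x + e.
    """
    n = cov.shape[0]
    identity = np.eye(n)
    # cov = L @ L.T with L lower triangular and a positive diagonal.
    L = np.linalg.cholesky(cov)
    # The diagonal of L holds the noise standard deviations.
    noise_std = np.diag(L)
    # L = inv(I - B) @ diag(noise_std)  =>  I - B = diag(noise_std) @ inv(L)
    L_inv = np.linalg.solve(L, identity)
    B = identity - noise_std[:, None] * L_inv
    return B, noise_std ** 2


# Hypothetical 3-node DAG: x1 -> x2, x1 -> x3, x2 -> x3.
B_true = np.array([
    [0.0, 0.0, 0.0],
    [0.8, 0.0, 0.0],
    [0.3, -0.5, 0.0],
])
noise_var = np.array([1.0, 0.5, 2.0])

# Population covariance implied by the model.
A = np.linalg.inv(np.eye(3) - B_true)
cov = A @ np.diag(noise_var) @ A.T

B_hat, var_hat = weights_from_covariance(cov)
print(np.round(B_hat, 6))
print(np.round(var_hat, 6))
```

The sketch uses the exact population covariance, so the recovered weights match the true ones up to floating-point error. With a sample covariance the entries of `B_hat` would be noisy and would need thresholding to read off edges, and the variable ordering, which is taken as given here, is the hard part that a structure recovery method has to determine.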