Finding cause-effect relationships is of key importance in science. Causal discovery aims to recover a graph from data that succinctly describes these cause-effect relationships. However, current methods face several challenges, especially when dealing with high-dimensional data and complex dependencies. Incorporating prior knowledge about the system can aid causal discovery. In this work, we leverage Cluster-DAGs as a prior knowledge framework to warm-start causal discovery. We show that Cluster-DAGs offer greater flexibility than existing approaches based on tiered background knowledge and introduce two modified constraint-based algorithms, Cluster-PC and Cluster-FCI, for causal discovery in the fully and partially observed setting, respectively. Empirical evaluation on simulated data demonstrates that Cluster-PC and Cluster-FCI outperform their respective baselines without prior knowledge.
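To make the warm-start idea concrete, here is a minimal sketch of how a Cluster-DAG could shrink the starting graph of a constraint-based search. It is not the paper's Cluster-PC or Cluster-FCI procedure, which the abstract does not spell out. It assumes the usual reading of a Cluster-DAG: variables in two clusters with no connecting cluster-level edge cannot be adjacent, and edges between connected clusters follow the direction of the cluster-level edge. The function name `warm_start_graph` and the toy clusters are hypothetical.

```python
from itertools import combinations


def warm_start_graph(clusters, cluster_edges):
    """Build a variable-level starting graph from a Cluster-DAG (illustrative only).

    clusters: dict mapping a cluster name to a list of variable names.
    cluster_edges: set of (parent_cluster, child_cluster) pairs.
    Returns (undirected, directed): candidate edges inside clusters as
    frozensets, and candidate edges across clusters as (tail, head) tuples.
    """
    undirected = set()
    directed = set()

    # Inside a cluster nothing is known, so every pair stays a candidate edge.
    for members in clusters.values():
        for u, v in combinations(members, 2):
            undirected.add(frozenset((u, v)))

    # Across clusters, keep a pair only if the clusters are adjacent in the
    # Cluster-DAG, and orient it along the cluster-level edge (assumption).
    for parent, child in cluster_edges:
        for u in clusters[parent]:
            for v in clusters[child]:
                directed.add((u, v))

    return undirected, directed


# Hypothetical example: five variables in three clusters, C1 -> C2 -> C3.
clusters = {"C1": ["a", "b"], "C2": ["c"], "C3": ["d", "e"]}
cluster_edges = {("C1", "C2"), ("C2", "C3")}

undirected, directed = warm_start_graph(clusters, cluster_edges)

n_vars = sum(len(m) for m in clusters.values())
complete_graph_edges = n_vars * (n_vars - 1) // 2
candidate_edges = len(undirected) + len(directed)
print(f"complete graph: {complete_graph_edges} edges, warm start: {candidate_edges} edges")
```

In this toy case the search would start from 6 candidate edges instead of the 10 in a complete graph, because the four pairs between C1 and C3 are ruled out up front. Fewer candidate edges mean fewer conditional independence tests, which is one plausible source of the gains the abstract reports.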