Learning the structure of a directed acyclic graph (DAG) from purely observational data remains a long-standing challenge across scientific domains. An emerging line of research uses the score of the data distribution to first identify a topological order of the underlying DAG via leaf node detection, and then prunes edges to recover the graph. This paper extends the score matching framework for causal discovery, originally designed for continuous data, to discrete data and introduces a novel leaf discriminant criterion based on the discrete score function. Through simulated and real-world experiments, we demonstrate that our theory enables accurate inference of the true causal order from observed discrete data, and that the identified ordering can significantly boost the accuracy of existing causal discovery baselines in nearly all settings.
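The ordering stage described in the abstract (repeatedly detect a leaf, remove it, and continue on the remaining variables) can be illustrated with a minimal sketch. The abstract does not specify the discrete-score leaf criterion, so the code below does not implement it: the function names `order_by_leaf_peeling` and `make_oracle_criterion` are hypothetical, and the toy "oracle" criterion reads leaves off a known graph purely to show the control flow. In practice the oracle would be replaced by a criterion estimated from data.

```python
from typing import Callable, Dict, List, Sequence, Set

# A leaf criterion maps (candidate node, remaining nodes) to a number;
# the node with the SMALLEST value is treated as the most leaf-like.
LeafCriterion = Callable[[str, Sequence[str]], float]


def order_by_leaf_peeling(nodes: Sequence[str], criterion: LeafCriterion) -> List[str]:
    """Build a topological order by repeatedly removing the most leaf-like node."""
    remaining = list(nodes)
    leaves_first: List[str] = []
    while remaining:
        # Pick the node that the criterion ranks as most leaf-like.
        leaf = min(remaining, key=lambda v: criterion(v, remaining))
        leaves_first.append(leaf)
        remaining.remove(leaf)
    # Leaves were collected first, so reverse to get a root-to-leaf order.
    return leaves_first[::-1]


def make_oracle_criterion(children: Dict[str, Set[str]]) -> LeafCriterion:
    """Toy stand-in (assumption): counts a node's children still in the graph.

    A true leaf of the remaining subgraph gets value 0. This uses the known
    graph and is NOT the paper's discrete-score criterion.
    """
    def criterion(node: str, remaining: Sequence[str]) -> float:
        return float(len(children[node] & set(remaining)))
    return criterion


# Toy DAG: A -> B, A -> C, B -> C
children = {"A": {"B", "C"}, "B": {"C"}, "C": set()}
order = order_by_leaf_peeling(["A", "B", "C"], make_oracle_criterion(children))
print(order)
```

Keeping the criterion as a pluggable function separates the peeling loop, which is common to this family of methods, from the leaf test itself, which is where the continuous and discrete settings differ. The edge-pruning stage mentioned in the abstract is not covered here.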