Many state-of-the-art causal discovery methods aim to generate an output graph that encodes the graphical separation and connection statements of the causal graph that underlies the data-generating process. In this work, we argue that an evaluation of a causal discovery method against synthetic data should include an analysis of how well this explicit goal is achieved by measuring how closely the separations/connections of the method's output align with those of the ground truth. We show that established evaluation measures do not accurately capture the difference in separations/connections of two causal graphs, and we introduce three new measures of distance called s/c-distance, Markov distance and Faithfulness distance that address this shortcoming. We complement our theoretical analysis with toy examples, empirical experiments and pseudocode.
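To make the idea of comparing separations/connections concrete, the following is a minimal sketch, not the paper's definition of s/c-distance. It assumes both graphs are DAGs over the same small node set, stored as a hypothetical `{node: set_of_parents}` dictionary, and it reports the plain unweighted fraction of (x, y, Z) statements on which the two graphs disagree. The function names and the unweighted averaging are illustrative assumptions; the measures introduced in this work may weight or restrict the statements differently.

```python
from itertools import combinations


def ancestors_of(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_separated(parents, x, y, z):
    """True if x and y are d-separated given z in the DAG `parents`.

    Uses the standard criterion: restrict to the ancestors of {x, y} and z,
    moralize that subgraph, delete z, and check whether x still reaches y.
    """
    keep = ancestors_of(parents, {x, y} | set(z))
    adj = {v: set() for v in keep}
    for v in keep:
        ps = parents[v]  # parents of a kept node are ancestors, so also kept
        for p in ps:  # undirected version of each parent-child edge
            adj[v].add(p)
            adj[p].add(v)
        for p, q in combinations(ps, 2):  # "marry" parents of a common child
            adj[p].add(q)
            adj[q].add(p)
    blocked = set(z)
    seen, stack = {x}, [x]
    while stack:
        for w in adj[stack.pop()]:
            if w == y:
                return False  # found an unblocked path, so x and y are connected
            if w not in seen and w not in blocked:
                seen.add(w)
                stack.append(w)
    return True


def separation_mismatch(parents_g, parents_h):
    """Fraction of (x, y, Z) statements on which the two DAGs disagree.

    Enumerates every conditioning set, so it is exponential in the number of
    nodes and only meant for toy graphs.
    """
    nodes = sorted(parents_g)
    differing = total = 0
    for x, y in combinations(nodes, 2):
        rest = [v for v in nodes if v not in (x, y)]
        for k in range(len(rest) + 1):
            for z in combinations(rest, k):
                total += 1
                if d_separated(parents_g, x, y, z) != d_separated(parents_h, x, y, z):
                    differing += 1
    return differing / total


# Toy graphs on three nodes (illustrative only).
chain = {"a": set(), "b": {"a"}, "c": {"b"}}      # a -> b -> c
fork = {"a": {"b"}, "b": set(), "c": {"b"}}       # a <- b -> c
collider = {"a": set(), "b": {"a", "c"}, "c": set()}  # a -> b <- c

print(separation_mismatch(chain, fork))      # 0.0: same separations, different edges
print(separation_mismatch(chain, collider))  # 2 of the 6 statements differ
```

The chain and the fork differ in an edge orientation yet imply exactly the same separations, so this sketch scores them as identical, whereas the chain and the collider disagree on both statements about the pair (a, c). This is the kind of discrepancy between edge-based and separation-based comparison that the abstract refers to.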