Nonlinear causal discovery from observational data imposes strict identifiability assumptions on the form of the structural equations used in the data-generating process. Evaluating structure learning methods under violations of these assumptions requires a rigorous and interpretable approach that quantifies both the structural similarity of the estimated graph to the ground truth and the capacity of the discovered graphs to be used for causal inference. Motivated by the lack of a unified performance assessment framework, we introduce an interpretable, six-dimensional evaluation metric, the distance to optimal solution (DOS), which is specifically tailored to the field of causal discovery. Furthermore, this is the first study to assess the performance of structure learning algorithms from seven different families on an increasing percentage of non-identifiable, nonlinear causal patterns inspired by real-world processes. Our large-scale simulation study, which incorporates seven experimental factors, shows that, besides causal order-based methods, amortized causal discovery delivers results with comparatively high proximity to the optimal solution.