Causal structure learning (CSL) is a prominent technique for encoding cause-and-effect relationships among variables through Bayesian Networks (BNs). Although recovering causal structure solely from data is challenging, integrating prior knowledge that reveals partial structural truth can markedly enhance learning quality. However, current methods based on prior knowledge exhibit limited resilience to errors in the prior: hard-constraint methods enforce priors while disregarding potential errors entirely, and soft-constraint methods accept priors according to a predetermined confidence level, which may still require expert intervention. To address this issue, we propose a CSL strategy that is resilient to edge-level prior errors, thereby minimizing human intervention. We classify prior errors into different types and derive their theoretical impact on the Structural Hamming Distance (SHD) under the assumption of sufficient data. Intriguingly, we discover and prove that the strongest hazard posed by prior errors is associated with a unique acyclic closed structure, which we term a ``quasi-circle''. Leveraging this insight, we employ a post-hoc strategy that identifies prior errors through their impact on the increase in the number of ``quasi-circles''. Through empirical evaluation on both real and synthetic datasets, we demonstrate our strategy's robustness against prior errors. In particular, we highlight its substantial ability to resist order-reversed errors while retaining the majority of correct priors.
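As background for the SHD metric mentioned above, the following is a minimal sketch of one common definition: the number of edge additions, deletions, and reversals needed to transform one directed graph into another, with a reversal counted as a single operation. The abstract does not fix the exact convention (some variants count a reversal as two operations), so this is an illustrative assumption, not the paper's definition.

```python
import numpy as np

def shd(A, B):
    """Structural Hamming Distance between two directed graphs.

    A and B are adjacency matrices (A[i][j] = 1 means edge i -> j).
    Counts edge additions, deletions, and reversals; a reversed edge
    (present in both graphs with opposite orientation) counts once.
    """
    A = np.asarray(A, dtype=bool)
    B = np.asarray(B, dtype=bool)
    diff = A != B                  # entries where the directed edges disagree
    reversals = A & B.T & ~B       # edge i -> j in A appears as j -> i in B
    # each reversal contributes two disagreeing entries in `diff`; count it once
    return int(diff.sum() - reversals.sum())

# Example: true graph 0 -> 1 -> 2 vs. a learned graph with 0 -> 1 reversed
true_dag = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
learned  = [[0, 0, 0], [1, 0, 1], [0, 0, 0]]
print(shd(true_dag, learned))  # one reversal -> SHD of 1
```

Under this convention, an order-reversed prior error on a single edge costs 1 rather than 2, which is why the treatment of reversals matters when analyzing the theoretical impact of each error type on SHD.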