Constraint-based causal discovery methods require a large number of conditional independence (CI) tests, and the resulting computational cost severely limits their practical applicability. Accelerating each individual test is therefore crucial. To this end, we propose the Flow Matching-based Conditional Independence Test (FMCIT). FMCIT leverages the computational efficiency of flow matching and requires the model to be trained only once for the entire causal discovery procedure, which substantially accelerates causal discovery. In numerical experiments, FMCIT controls the type-I error and maintains high power under the alternative hypothesis, even with high-dimensional conditioning sets. We further integrate FMCIT into a two-stage guided PC skeleton learning framework, termed GPC-FMCIT, which combines fast screening with guided, budgeted refinement using FMCIT. This design yields explicit bounds on the number of CI queries while maintaining high statistical power. Experiments on synthetic and real-world causal discovery tasks demonstrate favorable accuracy-efficiency trade-offs compared with existing CI testing methods and PC variants.
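To make the two-stage idea concrete, the following is a minimal Python sketch of screening followed by budgeted refinement. It is not the authors' GPC-FMCIT implementation. The function names (`fisher_z_pvalue`, `guided_skeleton`), the per-edge `budget` parameter, the guidance rule (rank candidate conditioning variables by their screening p-value), and the use of a Fisher z partial-correlation test in place of FMCIT are all assumptions made for illustration. Under these assumptions, the number of CI queries is at most d(d-1)/2 for screening plus `budget` for each edge that survives screening.

```python
import itertools
import math

import numpy as np


def fisher_z_pvalue(data, i, j, cond):
    """Fisher z partial-correlation test; a stand-in for any CI test (assumption)."""
    idx = [i, j] + list(cond)
    corr = np.corrcoef(data[:, idx], rowvar=False)
    prec = np.linalg.pinv(corr)
    # Partial correlation of columns i and j given cond, from the precision matrix.
    r = -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])
    r = min(max(r, -0.999999), 0.999999)
    n = data.shape[0]
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - len(cond) - 3)
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value


def guided_skeleton(data, ci_test, alpha=0.05, max_cond=2, budget=5):
    """Hypothetical two-stage skeleton search; returns (edges, number of CI queries)."""
    d = data.shape[1]
    adj = {v: set() for v in range(d)}
    score = {}
    queries = 0

    # Stage 1: fast marginal screening, one query per variable pair.
    for i, j in itertools.combinations(range(d), 2):
        p = fisher_z_pvalue(data, i, j, ())
        queries += 1
        score[(i, j)] = score[(j, i)] = p
        if p <= alpha:
            adj[i].add(j)
            adj[j].add(i)

    # Stage 2: refine each surviving edge with at most `budget` conditional tests.
    screened = sorted((i, j) for i in adj for j in adj[i] if i < j)
    for i, j in screened:
        # Guidance (assumption): try the most strongly associated neighbors first.
        nbrs = sorted((adj[i] | adj[j]) - {i, j},
                      key=lambda k: min(score[(i, k)], score[(j, k)]))
        candidates = itertools.chain.from_iterable(
            itertools.combinations(nbrs, size) for size in range(1, max_cond + 1))
        for cond in itertools.islice(candidates, budget):
            queries += 1
            if ci_test(data, i, j, cond) > alpha:
                # Independence not rejected: drop the edge and stop testing it.
                adj[i].discard(j)
                adj[j].discard(i)
                break

    edges = {(i, j) for i in adj for j in adj[i] if i < j}
    return edges, queries


# Toy chain X0 -> X1 -> X2 with Gaussian noise (synthetic data, for illustration only).
rng = np.random.default_rng(0)
x0 = rng.normal(size=2000)
x1 = x0 + rng.normal(size=2000)
x2 = x1 + rng.normal(size=2000)
data = np.column_stack([x0, x1, x2])

edges, queries = guided_skeleton(data, fisher_z_pvalue)
print(sorted(edges), queries)
```

In this sketch the `ci_test` argument is where a learned test such as FMCIT would be plugged in for the refinement stage, while the cheap screening test stays fixed. Any callable with the signature `(data, i, j, cond) -> p-value` fits.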