This paper addresses a key bottleneck in Super-Structure-based divide-and-conquer causal discovery: the high computational cost of constructing an accurate Super-Structure, particularly when conditional independence (CI) tests are expensive and domain knowledge is unavailable. We propose a lightweight framework that relaxes the strict requirements on Super-Structure construction while preserving the algorithmic benefits of divide-and-conquer. By combining weakly constrained Super-Structures with efficient graph partitioning and merging strategies, our approach substantially reduces the number of CI tests without sacrificing accuracy. We instantiate the framework as a concrete causal discovery algorithm and evaluate each of its components on synthetic data. Experiments on Gaussian Bayesian networks, including magic-NIAB, ECOLI70, and magic-IRRI, show that our method matches or closely approximates the structural accuracy of PC and FCI while requiring far fewer CI tests. Further validation on the real-world China Health and Retirement Longitudinal Study (CHARLS) dataset supports its practical applicability. Our results indicate that accurate, scalable causal discovery is achievable even under minimal assumptions about the initial Super-Structure, which opens the way to applying divide-and-conquer methods in large-scale, knowledge-scarce domains such as biomedical and social science research.
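To make the role of the Super-Structure concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes a PC-style adjacency search with a Fisher-z CI test on linear-Gaussian data and omits the partitioning and merging steps entirely. The function names (`fisher_z_pvalue`, `skeleton`), the toy chain graph, and the hand-picked loose Super-Structure are all hypothetical and serve only to illustrate that starting from a superset of the true edges, even an imprecise one, can reduce the number of CI tests compared with starting from the complete graph.

```python
import math
from itertools import combinations

import numpy as np


def fisher_z_pvalue(data, i, j, cond):
    """Two-sided p-value of the Fisher-z test for X_i independent of X_j given X_cond."""
    idx = [i, j, *cond]
    corr = np.corrcoef(data[:, idx], rowvar=False)
    prec = np.linalg.pinv(corr)
    # Partial correlation of the first two variables given the rest.
    r = -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])
    r = min(max(r, -0.999999), 0.999999)
    z = 0.5 * math.log((1 + r) / (1 - r))
    stat = math.sqrt(data.shape[0] - len(cond) - 3) * abs(z)
    return math.erfc(stat / math.sqrt(2))


def skeleton(data, candidate_edges, alpha=0.01, max_cond=2):
    """PC-style edge pruning restricted to candidate_edges (assumed illustration).

    Returns the surviving undirected edges and the number of CI tests run.
    """
    pairs = sorted({(min(a, b), max(a, b)) for a, b in candidate_edges})
    adj = {v: set() for v in range(data.shape[1])}
    for a, b in pairs:
        adj[a].add(b)
        adj[b].add(a)
    n_tests = 0
    for size in range(max_cond + 1):
        for a, b in pairs:
            if b not in adj[a]:
                continue  # edge already removed at a smaller conditioning size
            tried = set()
            # Conditioning sets are drawn from the current neighbours of a, then of b.
            for x, y in ((a, b), (b, a)):
                for cond in combinations(sorted(adj[x] - {y}), size):
                    if cond in tried or b not in adj[a]:
                        continue
                    tried.add(cond)
                    n_tests += 1
                    if fisher_z_pvalue(data, a, b, cond) > alpha:
                        adj[a].discard(b)
                        adj[b].discard(a)
    edges = {(a, b) for a in adj for b in adj[a] if a < b}
    return edges, n_tests


# Toy linear-Gaussian chain X0 -> X1 -> ... -> X5 (hypothetical example data).
rng = np.random.default_rng(0)
n_samples, n_vars = 2000, 6
data = np.zeros((n_samples, n_vars))
data[:, 0] = rng.normal(size=n_samples)
for k in range(1, n_vars):
    data[:, k] = 0.8 * data[:, k - 1] + rng.normal(size=n_samples)

true_edges = {(k, k + 1) for k in range(n_vars - 1)}
complete_graph = set(combinations(range(n_vars), 2))
# A deliberately loose Super-Structure: the true edges plus a few spurious ones.
loose_super_structure = true_edges | {(0, 2), (1, 3), (2, 4)}

edges_full, n_full = skeleton(data, complete_graph)
edges_super, n_super = skeleton(data, loose_super_structure)
print("CI tests from complete graph:  ", n_full)
print("CI tests from Super-Structure: ", n_super)
```

In this sketch the saving comes from two sources: fewer candidate pairs to test, and smaller neighbourhoods from which conditioning sets are drawn. How the actual framework builds, partitions, and merges over its Super-Structure is described in the paper itself and is not reproduced here.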