Learning causal structures from observational data remains a fundamental yet computationally intensive task, particularly in high-dimensional settings, where the search space grows super-exponentially with the number of variables and computational demands rise accordingly. To address this, we introduce VISTA (Voting-based Integration of Subgraph Topologies for Acyclicity), a modular framework that decomposes the global causal structure learning problem into local subgraphs based on Markov Blankets. The local results are integrated globally through a weighted voting mechanism that penalizes low-support edges via exponential decay, filters unreliable edges with an adaptive threshold, and enforces acyclicity using a Feedback Arc Set (FAS) algorithm. The framework is model-agnostic: it imposes no assumptions on the inductive biases of the base learners, is compatible with arbitrary data settings without requiring specific structural forms, and fully supports parallelization. We also establish finite-sample error bounds for VISTA and prove its asymptotic consistency under mild conditions. Extensive experiments on both synthetic and real datasets show that VISTA improves both accuracy and efficiency across a wide range of base learners.
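The integration stage described above (weighted voting with exponential decay, adaptive thresholding, and cycle removal) can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything concrete here is an assumption made for illustration: the score `(votes / coverage) * (1 - exp(-lam * votes))`, the mean-score threshold, and the greedy "drop the weakest edge on each cycle" heuristic standing in for a Feedback Arc Set algorithm. The function names `aggregate_votes`, `adaptive_threshold`, `find_cycle`, and `break_cycles` are hypothetical and not taken from VISTA.

```python
import math
from collections import defaultdict


def aggregate_votes(local_graphs, lam=1.0):
    """Score directed edges from local subgraphs.

    local_graphs: list of (nodes, edges); each edge (u, v) must lie within nodes.
    Assumed score: vote fraction times an exponential-decay penalty on low support.
    """
    coverage = defaultdict(int)  # number of subgraphs containing both endpoints
    votes = defaultdict(int)     # number of subgraphs proposing u -> v
    for nodes, edges in local_graphs:
        for u in nodes:
            for v in nodes:
                if u != v:
                    coverage[(u, v)] += 1
        for edge in edges:
            votes[edge] += 1
    return {
        e: (k / coverage[e]) * (1.0 - math.exp(-lam * k))
        for e, k in votes.items()
    }


def adaptive_threshold(scores):
    """Keep edges scoring at least the mean score (an assumed adaptive rule)."""
    if not scores:
        return {}
    tau = sum(scores.values()) / len(scores)
    return {e: s for e, s in scores.items() if s >= tau}


def find_cycle(edges):
    """Return the edges of one directed cycle, or None if the graph is acyclic."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
    state = {}  # 1 = on the current DFS path, 2 = fully explored
    path = []

    def dfs(u):
        state[u] = 1
        path.append(u)
        for v in adj[u]:
            if state.get(v) == 1:  # back edge closes a cycle
                cyc = path[path.index(v):] + [v]
                return list(zip(cyc, cyc[1:]))
            if v not in state:
                found = dfs(v)
                if found:
                    return found
        state[u] = 2
        path.pop()
        return None

    for node in list(adj):
        if node not in state:
            found = dfs(node)
            if found:
                return found
    return None


def break_cycles(scores):
    """Greedy feedback-arc-set heuristic: drop the weakest edge of each cycle."""
    kept = dict(scores)
    while True:
        cycle = find_cycle(kept)
        if cycle is None:
            return kept
        del kept[min(cycle, key=lambda e: kept[e])]


# Toy local subgraphs, e.g. one per Markov Blanket (made-up data).
local_graphs = [
    ({"A", "B", "C"}, [("A", "B"), ("B", "C")]),
    ({"B", "C", "D"}, [("B", "C"), ("C", "D")]),
    ({"A", "C", "D"}, [("C", "A"), ("C", "D")]),
]
final_edges = break_cycles(adaptive_threshold(aggregate_votes(local_graphs)))
print(sorted(final_edges))
```

Because each local subgraph is scored independently before aggregation, the local learning step that would produce `local_graphs` is the natural place to parallelize; the aggregation itself is a single cheap pass over the votes.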