Causal discovery plays a pivotal role in mitigating model uncertainty by recovering the underlying causal mechanisms among variables. In many practical domains, such as healthcare, access to the data gathered by individual entities is limited, primarily due to privacy and regulatory constraints. However, the majority of existing causal discovery methods require the data to be available in a centralized location. In response, researchers have introduced federated causal discovery. While previous federated methods consider distributed observational data, the integration of interventional data remains largely unexplored. We propose FedCDI, a federated framework for inferring causal structures from distributed data containing interventional samples. In line with the federated learning framework, FedCDI improves privacy by exchanging belief updates rather than raw samples. Additionally, it introduces a novel intervention-aware method for aggregating individual updates. We analyze scenarios with shared or disjoint intervened covariates, and mitigate the adverse effects of interventional data heterogeneity. The performance and scalability of FedCDI are rigorously tested across a variety of synthetic and real-world graphs.