Conventional causal discovery methods rely on centralized data, which is at odds with the decentralized nature of data in many real-world settings. This mismatch has motivated the development of federated causal discovery (FCD) approaches. However, existing FCD methods may be limited by potentially restrictive assumptions, namely identifiable functional causal models or homogeneous data distributions, which narrows their applicability in diverse scenarios. In this paper, we propose a novel FCD method that aims to accommodate arbitrary causal models and heterogeneous data. We first use a surrogate variable corresponding to the client index to account for data heterogeneity across clients. We then develop a federated conditional independence test (FCIT) for causal skeleton discovery and establish a federated independent change principle (FICP) to determine causal directions. Both approaches construct summary statistics as a proxy for the raw data in order to protect data privacy. Because FCIT and FICP are nonparametric, they make no assumption about particular functional forms, which facilitates the handling of arbitrary causal models. We conduct extensive experiments on synthetic and real datasets to demonstrate the efficacy of our method. The code is available at \url{https://github.com/lokali/FedCDH.git}.
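To make the idea of summary statistics as a proxy for raw data concrete, here is a minimal sketch. It is not the FCIT or FICP procedure from the paper. It assumes a much simpler linear setting in which each client shares only its sample count, column sums, and cross-product matrix, and a server combines these into the pooled covariance. The function names `local_summary` and `aggregate_covariance` are hypothetical and chosen for illustration only.

```python
import numpy as np

def local_summary(x):
    """Client side (hypothetical helper): return (n, column sums, X^T X).

    Only these aggregates leave the client, never the raw rows.
    """
    x = np.asarray(x, dtype=float)
    return x.shape[0], x.sum(axis=0), x.T @ x

def aggregate_covariance(summaries):
    """Server side (hypothetical helper): pooled covariance from client summaries."""
    n = sum(s[0] for s in summaries)
    total = sum(s[1] for s in summaries)
    cross = sum(s[2] for s in summaries)
    mean = total / n
    # Population covariance: E[x x^T] - E[x] E[x]^T
    return cross / n - np.outer(mean, mean)

# Simulated heterogeneous clients: each has a different mean and sample size.
rng = np.random.default_rng(0)
clients = [rng.normal(loc=k, size=(50 + 10 * k, 3)) for k in range(4)]

fed_cov = aggregate_covariance([local_summary(x) for x in clients])

# Reference value computed on centralized data, for comparison only.
pooled_cov = np.cov(np.vstack(clients), rowvar=False, bias=True)
print(np.allclose(fed_cov, pooled_cov))
```

In this linear sketch the aggregated statistics reproduce the centralized covariance, so any test built on that covariance could in principle run on the server without access to raw samples. A nonparametric test such as the one described in the abstract would need richer summaries than second moments of the raw variables.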