Graphs are widely used to represent relations among entities. When a single party owns the complete data, the entire graph can be built easily and analyzing it is straightforward. In many scenarios, however, centralizing the data is impractical because of privacy concerns: each organization or party holds only part of the whole graph, i.e., the graph data is isolated across parties. Federated Learning (FL) has recently been proposed to address this data isolation issue, mainly for Euclidean data. Applying FL to graph data remains challenging because graphs carry topological information, which is notoriously non-IID and hard to partition. In this work, we propose FedCog, a novel FL framework for graph data that efficiently handles coupled graphs, a kind of distributed graph data that arises widely in real-world applications such as mobile carriers' communication networks and banks' transaction networks. We theoretically prove the correctness and security of FedCog. Experimental results demonstrate that FedCog significantly outperforms traditional FL methods on graphs, improving the accuracy of node classification tasks by up to 14.7%.