Federated graph learning (FGL) has emerged as an important research topic in response to the increasing scale and distributed nature of graph-structured data in the real world. In FGL, a global graph is distributed across clients, each of which holds a subgraph. Existing FGL methods often fail to effectively utilize cross-client edges, losing structural information during training; in addition, local graphs often exhibit significant distribution divergence. These two issues make local models in FGL less effective than their counterparts in centralized graph learning, a phenomenon we term the local bias problem in this paper. To address this problem, we propose a novel FGL framework that makes the local models approximate the model trained in a centralized setting. Specifically, we design a distributed learning scheme that fully leverages cross-client edges to aggregate information from other clients. In addition, we propose a label-guided sampling approach that alleviates local data imbalance and, at the same time, substantially reduces training overhead. Extensive experiments demonstrate that local bias degrades model performance and slows convergence during training. Experimental results also verify that our framework successfully mitigates local bias, achieving better performance than baseline methods with lower time and memory overhead.
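The abstract only names the label-guided sampling technique, so the following is a minimal, hypothetical sketch of one way such a sampler could work: drawing roughly the same number of training nodes from each label class on a client, so that class-imbalanced local data still yields balanced mini-batches. The function name `label_guided_sample` and the `per_class` budget are illustrative assumptions, not the paper's actual method or API.

```python
import random
from collections import defaultdict

def label_guided_sample(node_ids, labels, per_class):
    """Illustrative sketch (not the paper's algorithm): draw up to
    `per_class` training nodes from each label class on a client,
    yielding a class-balanced seed set for one local training round."""
    buckets = defaultdict(list)
    for nid, y in zip(node_ids, labels):
        buckets[y].append(nid)
    batch = []
    for nodes in buckets.values():
        k = min(per_class, len(nodes))
        batch.extend(random.sample(nodes, k))
    random.shuffle(batch)
    return batch

# Example: a client whose local labels are heavily skewed toward class 0.
ids = list(range(10))
ys = [0, 0, 0, 0, 0, 0, 0, 1, 1, 2]
print(label_guided_sample(ids, ys, per_class=2))
```

Because each class contributes at most `per_class` seeds, a skewed local label distribution no longer dominates the mini-batch, and the smaller seed set also caps the per-round sampling cost.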