Graph neural networks (GNNs) have been widely deployed in real-world networked applications and systems because of their ability to handle graph-structured data. However, growing awareness of data privacy severely challenges the traditional centralized training paradigm, in which a server holds all the graph information. Federated learning is an emerging collaborative computing paradigm that allows model training without data centralization. Existing federated GNN studies mainly focus on systems where clients hold distinct graphs or subgraphs. The practical node-level federated setting, where each client is aware only of its direct neighbors, has yet to be studied. In this paper, we propose Lumos, the first federated GNN framework that supports supervised and unsupervised learning with feature and degree protection on node-level federated graphs. We first design a tree constructor to improve representation capability given the limited structural information. We further present a Markov chain Monte Carlo (MCMC)-based algorithm that mitigates the workload imbalance caused by degree heterogeneity, with theoretically guaranteed performance. Based on the tree constructed for each client, we propose a decentralized tree-based GNN trainer to support versatile training. Extensive experiments demonstrate that Lumos outperforms the baseline with significantly higher accuracy and greatly reduced communication cost and training time.