FedGTA: Topology-aware Averaging for Federated Graph Learning

Federated Graph Learning (FGL) is a distributed machine learning paradigm that enables collaborative training on large-scale subgraphs held by multiple local systems. Existing FGL studies fall into two categories: (i) FGL Optimization, which improves multi-client training of existing machine learning models; and (ii) FGL Model, which improves performance through complex local models and multi-client interactions. However, most FGL optimization strategies were designed for the computer vision domain and ignore graph structure, resulting in unsatisfactory performance and slow convergence. Meanwhile, the complex local model architectures in FGL Model studies scale poorly to large subgraphs and are difficult to deploy. To address these issues, we propose Federated Graph Topology-aware Aggregation (FedGTA), a personalized optimization strategy that optimizes through topology-aware local smoothing confidence and mixed neighbor features. We evaluate FedGTA on 12 multi-scale real-world datasets under both the Louvain and Metis splits, which lets us assess its performance and robustness across a range of scenarios. Extensive experiments demonstrate that FedGTA achieves state-of-the-art performance while exhibiting high scalability and efficiency. The experiments include ogbn-papers100M, the most representative large-scale graph dataset, which allows us to verify the applicability of our method to large-scale graph learning. To the best of our knowledge, our study is the first to bridge large-scale graph learning with FGL using this optimization strategy, contributing to the development of efficient and scalable FGL methods.

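
The abstract names two client-side statistics, a topology-aware local smoothing confidence and mixed neighbor features, but does not define them. The sketch below is one plausible reading, not the authors' implementation. It assumes that the confidence is a degree-weighted "log(C) minus entropy" score computed on label-propagated soft labels, that the neighbor features are per-hop means of the propagated labels, and that the server averages each client only with clients whose feature vectors are similar. All function names, the `alpha` and `threshold` parameters, and the exact formulas are hypothetical.

```python
import numpy as np


def normalized_adjacency(adj):
    """Symmetrically normalize a dense adjacency matrix: D^-1/2 A D^-1/2."""
    deg = adj.sum(axis=1)
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    return inv_sqrt[:, None] * adj * inv_sqrt[None, :]


def propagate(adj, soft_labels, k=3, alpha=0.5):
    """Smooth soft labels over the graph for k steps (assumed propagation rule)."""
    norm_adj = normalized_adjacency(adj)
    y = soft_labels.astype(float)
    for _ in range(k):
        y = alpha * soft_labels + (1.0 - alpha) * norm_adj @ y
    return y


def smoothing_confidence(adj, soft_labels, k=3, alpha=0.5):
    """Assumed confidence: degree-weighted (log C - entropy) of smoothed labels.

    Rows of soft_labels are assumed to be probability distributions.
    """
    y = propagate(adj, soft_labels, k, alpha)
    p = y / y.sum(axis=1, keepdims=True)
    entropy = -(p * np.log(np.clip(p, 1e-12, None))).sum(axis=1)
    deg = adj.sum(axis=1)
    return float((deg * (np.log(p.shape[1]) - entropy)).sum())


def neighbor_signature(adj, soft_labels, k=3):
    """Assumed 'mixed neighbor features': per-hop mean of propagated labels."""
    norm_adj = normalized_adjacency(adj)
    y = soft_labels.astype(float)
    hops = []
    for _ in range(k):
        y = norm_adj @ y
        hops.append(y.mean(axis=0))
    return np.concatenate(hops)


def personalized_aggregate(weights, signatures, confidences, threshold=0.5):
    """For each client, average the weights of clients with similar signatures.

    Similarity is cosine similarity between signatures; averaging coefficients
    are proportional to each selected client's confidence.
    """
    sigs = np.stack(signatures).astype(float)
    unit = sigs / np.maximum(np.linalg.norm(sigs, axis=1, keepdims=True), 1e-12)
    sim = unit @ unit.T
    conf = np.asarray(confidences, dtype=float)
    stacked = np.stack(weights).astype(float)
    personalized = []
    for i in range(len(weights)):
        mask = (sim[i] >= threshold).astype(float)
        mask[i] = 1.0  # a client always keeps its own model in the average
        coeff = conf * mask
        if coeff.sum() <= 0:  # fall back to a uniform average over the group
            coeff = mask
        coeff = coeff / coeff.sum()
        personalized.append(coeff @ stacked)
    return personalized
```

In this sketch each client uploads its flattened model weights, one scalar confidence, and one signature vector, and receives back its own personalized average. Sending only these summary statistics alongside the weights is a design choice of the sketch, not something the abstract specifies.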