Federated learning (FL) addresses data privacy concerns by enabling collaborative training of AI models across distributed data owners. Wide adoption of FL, however, faces two fundamental challenges: data heterogeneity and the large number of data owners involved. In this paper, we investigate the potential of Transformer-based FL models to achieve generalization and personalization in this setting. We conduct extensive comparative experiments on FL with Transformers, ResNet, and personalized ResNet-based approaches under various scenarios. These experiments vary the number of data owners to demonstrate the advantages of Transformers over other deep neural networks in large-scale heterogeneous FL tasks. In addition, we analyze the superior performance of Transformers by comparing Centered Kernel Alignment (CKA) representation similarity across different layers and FL models, to gain insight into the reasons behind their promising capabilities.
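To make the CKA comparison mentioned above concrete, the following is a minimal sketch of linear CKA between two activation matrices recorded on the same inputs. It assumes the standard linear-kernel formulation; the abstract does not state which CKA variant or kernel is used, and the function name `linear_cka` and the synthetic arrays `acts_a`, `acts_b`, and `acts_c` are hypothetical, introduced only for illustration.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between activations x (n, p) and y (n, q) on the same n inputs.

    Assumption: linear kernel; this is not necessarily the variant used in the paper.
    """
    # Center each feature (column) so the similarity ignores mean offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # Squared Frobenius norm of the cross-covariance, normalized by the
    # Frobenius norms of the two self-covariances.
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


# Synthetic stand-ins for layer activations (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
acts_a = rng.normal(size=(64, 16))
rotation, _ = np.linalg.qr(rng.normal(size=(16, 16)))  # random orthogonal matrix
acts_b = acts_a @ rotation * 3.0   # rotated and rescaled copy of acts_a
acts_c = rng.normal(size=(64, 32))  # unrelated representation of a different width

print(linear_cka(acts_a, acts_b))
print(linear_cka(acts_a, acts_c))
```

Linear CKA is invariant to orthogonal transformations and isotropic scaling of the features, so the rotated and rescaled copy scores essentially 1 against the original, while the unrelated representation scores lower. The two matrices may also have different widths, which is what allows layers of different models to be compared.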