Federated learning (FL) addresses data privacy concerns by enabling collaborative training of AI models across distributed data owners. Its wide adoption faces two fundamental challenges: data heterogeneity and the large number of data owners involved. In this paper, we investigate the potential of Transformer-based FL models to achieve both generalization and personalization in this setting. We conduct extensive comparative experiments on FL with Transformers, ResNet, and personalized ResNet-based FL approaches under various scenarios. These experiments vary the number of data owners to demonstrate the advantages of Transformers over conventional deep neural networks such as ResNet in large-scale heterogeneous FL tasks. In addition, we analyze the superior performance of Transformers by comparing Centered Kernel Alignment (CKA) representation similarity across layers and across FL models, to gain insight into the reasons behind it.
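As a rough illustration of the CKA comparison mentioned in the abstract, the following is a minimal sketch of linear CKA between two activation matrices. The abstract does not say which CKA variant or kernel is used, so the linear form is an assumption here, and the function name `linear_cka` and the random stand-in activations are hypothetical.

```python
import numpy as np


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA between two activation matrices.

    x has shape (n_samples, d1) and y has shape (n_samples, d2); rows must
    correspond to the same inputs in both matrices.
    """
    # Center each feature so the similarity ignores constant offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)

    # Squared Frobenius norm of the cross-covariance, normalized by the
    # Frobenius norms of the two self-covariances.
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


# Hypothetical stand-ins for layer activations from two FL models.
rng = np.random.default_rng(0)
acts_a = rng.normal(size=(64, 16))

# A rotated and rescaled copy of acts_a: linear CKA is invariant to both.
rotation, _ = np.linalg.qr(rng.normal(size=(16, 16)))
acts_a_rotated = 3.0 * (acts_a @ rotation)

# Unrelated random activations with a different width.
acts_b = rng.normal(size=(64, 32))

same = linear_cka(acts_a, acts_a)
rotated = linear_cka(acts_a, acts_a_rotated)
unrelated = linear_cka(acts_a, acts_b)
```

In practice, the rows of each matrix would be activations of one layer on a shared evaluation batch, and the score would be computed for every pair of layers (or every pair of client models) to build a similarity map.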