Federated learning (FL) is an emerging machine learning paradigm that allows multiple parties to collaboratively train a shared model in a privacy-preserving manner. Existing horizontal FL methods generally assume that the FL server and clients hold models with the same structure. However, due to system heterogeneity and the need for personalization, enabling clients to hold models with diverse structures has become an important direction. Existing model-heterogeneous FL approaches often require publicly available datasets and incur high communication and/or computational costs, which limit their performance. To address these limitations, we propose the Federated Global prediction Header (FedGH) approach. It is a communication- and computation-efficient model-heterogeneous FL framework that trains a shared, generalized global prediction header at the FL server, using representations extracted by the heterogeneous feature extractors of clients' models. The trained global prediction header thereby learns from different clients. The acquired global knowledge is then transferred to the clients by replacing each client's local prediction header with the global one. We derive the non-convex convergence rate of FedGH. Extensive experiments on two real-world datasets demonstrate that FedGH achieves significantly better performance in both model-homogeneous and model-heterogeneous FL scenarios than seven state-of-the-art personalized FL models, outperforming the best-performing baseline by up to 8.87% (model-homogeneous FL) and 1.83% (model-heterogeneous FL) in average test accuracy, while reducing communication overhead by up to 85.53%.
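The workflow described in the abstract (heterogeneous extractors on the clients, one prediction header trained at the server, then sent back to replace each local header) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes that every extractor outputs representations of the same size (`REP_DIM`), that clients upload raw (representation, label) pairs, and that the header is a linear softmax layer; the abstract specifies none of these. All names (`make_extractor`, `train_global_header`, `run_round`) are hypothetical, and local client training is omitted for brevity.

```python
import copy
import math
import random

REP_DIM = 2       # assumed: all extractors share one representation size
NUM_CLASSES = 2


def make_extractor(scale, shift):
    """Hypothetical stand-in for a client's private feature extractor.
    Different (scale, shift) values mimic different model structures."""
    def extract(x):
        return [math.tanh(scale * x[0] + shift), math.tanh(scale * x[1] - shift)]
    return extract


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def header_logits(header, rep):
    weights, bias = header
    return [sum(w * r for w, r in zip(weights[c], rep)) + bias[c]
            for c in range(NUM_CLASSES)]


def train_global_header(header, uploads, lr=0.5, epochs=200):
    """Server side: fit one linear header with SGD on softmax cross-entropy,
    using representations collected from all clients."""
    weights, bias = header
    for _ in range(epochs):
        for rep, label in uploads:
            probs = softmax(header_logits((weights, bias), rep))
            for c in range(NUM_CLASSES):
                grad = probs[c] - (1.0 if c == label else 0.0)
                bias[c] -= lr * grad
                for j in range(REP_DIM):
                    weights[c][j] -= lr * grad * rep[j]
    return weights, bias


def run_round(clients, header):
    # Clients upload representations and labels, not raw inputs or full models.
    uploads = [(c["extract"](x), y) for c in clients for x, y in c["data"]]
    header = train_global_header(header, uploads)
    for c in clients:
        c["header"] = copy.deepcopy(header)  # replace the local header
    return header


def accuracy(client):
    correct = 0
    for x, y in client["data"]:
        logits = header_logits(client["header"], client["extract"](x))
        correct += int(logits.index(max(logits)) == y)
    return correct / len(client["data"])


rng = random.Random(0)


def make_data(n=20):
    """Synthetic two-class data: class 0 near (-1, -1), class 1 near (1, 1)."""
    data = []
    for i in range(n):
        label = i % 2
        center = 1.0 if label == 1 else -1.0
        x = [center + rng.uniform(-0.3, 0.3), center + rng.uniform(-0.3, 0.3)]
        data.append((x, label))
    return data


clients = [{"extract": make_extractor(s, b), "data": make_data(), "header": None}
           for s, b in [(1.0, 0.0), (2.0, 0.1), (3.0, -0.1)]]
init_header = ([[0.0] * REP_DIM for _ in range(NUM_CLASSES)], [0.0] * NUM_CLASSES)
global_header = run_round(clients, init_header)
accuracies = [accuracy(c) for c in clients]
print(accuracies)
```

In this sketch only the header parameters (here `NUM_CLASSES * REP_DIM + NUM_CLASSES` values) travel from server to client, which hints at why sharing a header can cost less communication than exchanging full models. The savings reported in the abstract come from the authors' experiments, not from this toy setup.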