A single global model in federated learning (FL) might not be sufficient to serve many clients with non-IID tasks and distributions. While there have been advances in FL that train multiple global models for better personalization, they offer clients only a limited set of choices, so local finetuning is still indispensable. In this paper, we propose a novel ``FedMerge'' approach that creates a personalized model per client by simply merging multiple global models with automatically optimized and customized weights. In FedMerge, a few global models can serve many non-IID clients, even without further local finetuning. We formulate this problem as a joint optimization of the global models and the merging weights for each client. Unlike existing FL approaches, where the server broadcasts one or multiple global models to all clients, in FedMerge the server only needs to send a customized, merged model to each client. Moreover, rather than periodically interrupting local training and re-initializing it to a global model, FedMerge provides a merged model that aligns better with each client's task and data distribution, smoothing the local-global gap between consecutive rounds caused by client drift. We evaluate FedMerge in three different non-IID settings spanning different domains with diverse tasks and data types, in which FedMerge consistently outperforms existing FL approaches, including clustering-based and mixture-of-experts (MoE) based methods.
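To make the merging step concrete, the following is a minimal sketch of how a server might build one personalized model per client from a small set of global models. It assumes the merge is a parameter-wise weighted sum, which the abstract does not specify; the helper `merge_models`, the toy parameter names `W` and `b`, and the fixed weight values are all hypothetical. In FedMerge the merging weights are optimized jointly with the global models, whereas here they are hard-coded purely for illustration.

```python
import numpy as np


def merge_models(global_models, weights):
    """Hypothetical helper: parameter-wise weighted sum of K global models.

    global_models: list of K dicts mapping parameter name -> np.ndarray
    weights:       sequence of K merging weights for one client
    """
    weights = np.asarray(weights, dtype=float)
    if weights.shape != (len(global_models),):
        raise ValueError("need exactly one merging weight per global model")

    merged = {}
    for name in global_models[0]:
        # Stack the K copies of this parameter along a new leading axis: (K, ...)
        stacked = np.stack([model[name] for model in global_models])
        # Contract the K axis with the client's weights: sum_k w_k * theta_k
        merged[name] = np.tensordot(weights, stacked, axes=1)
    return merged


# Two toy "global models", each with one weight matrix and one bias vector.
global_models = [
    {"W": np.ones((2, 2)), "b": np.zeros(2)},
    {"W": 3.0 * np.ones((2, 2)), "b": np.ones(2)},
]

# Assumed per-client merging weights (fixed here; learned in the actual method).
client_weights = {
    "client_a": [0.75, 0.25],
    "client_b": [0.0, 1.0],
}

# The server sends each client only its own merged model.
personalized = {
    client_id: merge_models(global_models, w)
    for client_id, w in client_weights.items()
}
```

Under these assumptions, `client_a` receives a model whose `W` entries are all 1.5 (0.75 × 1 + 0.25 × 3), while `client_b` receives an exact copy of the second global model. Each client downloads a single model regardless of how many global models the server maintains.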