Federated Learning (FL) offers a collaborative training framework that allows multiple clients to contribute to a shared model without compromising data privacy. Because local datasets are heterogeneous, updated client models may overfit and diverge from one another, a problem commonly known as client drift. In this paper, we propose FedBug (Federated Learning with Bottom-Up Gradual Unfreezing), a novel FL framework designed to mitigate client drift. FedBug adaptively leverages the client model parameters, distributed by the server at each global round, as reference points for cross-client alignment. Specifically, on the client side, FedBug begins by freezing the entire model and then gradually unfreezes the layers, from the input layer to the output layer. This bottom-up approach allows each model to train its newly thawed layers to project data into a latent space in which the separating hyperplanes remain consistent across all clients. We theoretically analyze FedBug in a novel over-parameterized FL setup, revealing a superior convergence rate compared to FedAvg. Through comprehensive experiments spanning various datasets, training conditions, and network architectures, we validate the efficacy of FedBug. Our contributions encompass a novel FL framework, a theoretical analysis, and empirical validation, which together demonstrate the broad potential and applicability of FedBug.
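The client-side procedure described above (freeze everything, then thaw layers from input to output) can be illustrated with a small scheduling sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `trainable_mask`, the parameter `gu_fraction` (the share of local steps spent in the gradual-unfreezing phase), and the choice to thaw layers at evenly spaced intervals are all hypothetical and are not specified in the abstract.

```python
def trainable_mask(step, total_steps, num_layers, gu_fraction=0.5):
    """Return one boolean per layer (ordered input -> output); True = trainable.

    Assumed schedule (hypothetical): the first `gu_fraction` of the local
    steps form the gradual-unfreezing phase, split evenly across layers.
    The input-side layer is thawed at step 0, and one more layer is thawed
    at each interval. After the phase ends, every layer is trainable.
    """
    if num_layers <= 0:
        raise ValueError("num_layers must be positive")
    if not 0 <= step < total_steps:
        raise ValueError("step must satisfy 0 <= step < total_steps")
    if not 0.0 <= gu_fraction <= 1.0:
        raise ValueError("gu_fraction must lie in [0, 1]")

    gu_steps = int(total_steps * gu_fraction)
    if gu_steps == 0 or step >= gu_steps:
        # Gradual-unfreezing phase is over (or disabled): train all layers.
        return [True] * num_layers

    # Number of layers thawed so far, counted from the input side.
    n_unfrozen = step * num_layers // gu_steps + 1
    return [i < n_unfrozen for i in range(num_layers)]


def unfreezing_schedule(total_steps, num_layers, gu_fraction=0.5):
    """Return the number of trainable layers at every local step."""
    return [
        sum(trainable_mask(s, total_steps, num_layers, gu_fraction))
        for s in range(total_steps)
    ]
```

In a real client update, the mask would decide which layers receive gradient updates at each local step, while the still-frozen layers keep the parameters distributed by the server. Setting `gu_fraction=0` in this sketch disables the unfreezing phase, so all layers train from the first step, as in ordinary local training.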