Federated learning is a collaborative learning paradigm in which clients cooperate by sharing model parameters instead of raw data. However, in the non-IID setting, the global model suffers from client drift, which can severely degrade its final performance. Previous methods typically correct an already-drifted global model using loss- or gradient-based adjustments, overlooking the impact of client-side samples. In this paper, we rethink the role of the client side and propose Federated Balanced Learning (FBL), which prevents this issue from the outset through sample balance on the client side. Technically, FBL enables clients with unbalanced data to achieve sample balance through knowledge filling and knowledge sampling with edge-side generative models, under the constraint of a fixed number of data samples per client. Furthermore, we design a Knowledge Alignment Strategy to bridge the gap between synthetic and real data, and a Knowledge Drop Strategy to regularize our method. We also scale our method to realistic and complex scenarios, allowing different clients to adopt different strategies, and extend our framework to further improve performance. Extensive experiments show that our method outperforms state-of-the-art baselines. The code will be released upon acceptance.
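The client-side balancing idea described above can be sketched as a simple planning step: given a client's per-class counts and a fixed sample budget, classes above the per-class target are downsampled (knowledge sampling), while classes below it are topped up with synthetic samples from a generative model (knowledge filling). This is a minimal illustrative sketch, not the paper's actual algorithm; the function name, dictionary layout, and the uniform per-class target are assumptions for illustration.

```python
def balance_client_samples(class_counts, budget, num_classes):
    """Plan how a client reaches a balanced dataset under a fixed budget.

    class_counts: dict mapping class id -> number of real samples held.
    budget:       fixed total number of samples the client may hold.
    num_classes:  number of classes in the task.

    Returns a per-class plan: how many real samples to keep and how many
    synthetic samples to request from the edge-side generative model.
    """
    target = budget // num_classes  # assumed uniform per-class target
    plan = {}
    for c in range(num_classes):
        real = class_counts.get(c, 0)
        if real >= target:
            # Majority class: keep a subset of real samples (knowledge sampling).
            plan[c] = {"keep_real": target, "generate": 0}
        else:
            # Minority class: fill the gap with synthetic samples (knowledge filling).
            plan[c] = {"keep_real": real, "generate": target - real}
    return plan
```

For example, a client holding 50/4/0 samples of three classes with a budget of 30 would keep 10 real samples of the majority class, keep all 4 of the second and generate 6 more, and generate 10 synthetic samples for the missing class, ending with a balanced 10/10/10 split.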