Federated Learning (FL) models often suffer from client drift caused by heterogeneous data, where the data distribution differs across clients. To address this issue, prior work has focused primarily on manipulating the existing gradients to obtain more consistent client models. In this paper, we take an alternative perspective on client drift and aim to mitigate it by generating improved local models. First, we analyze the generalization contribution of local training and conclude that it is bounded by the conditional Wasserstein distance between the data distributions of different clients. We then propose FedImpro, which constructs similar conditional distributions for local training. Specifically, FedImpro decouples the model into high-level and low-level components and trains the high-level portion on reconstructed feature distributions. This approach enhances the generalization contribution and reduces the dissimilarity of gradients in FL. Experimental results show that FedImpro helps FL withstand data heterogeneity and improves the generalization performance of the model.
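To make the decoupling idea concrete, the following is a minimal NumPy sketch of one possible reading of it, not the authors' implementation. It assumes that the feature distribution is approximated by a per-class diagonal Gaussian (the abstract does not say how the distributions are reconstructed), that the low-level part is a fixed random ReLU layer, and that the high-level part is a softmax classifier. All names (`low_level`, `class_feature_stats`, `merge_stats`, `sample_features`, `train_high_level`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_CLASSES, IN_DIM, FEAT_DIM = 3, 8, 4

def low_level(x, w_low):
    """Low-level part: maps raw inputs to features (here a single ReLU layer)."""
    return np.maximum(x @ w_low, 0.0)

def class_feature_stats(feats, labels):
    """Per-class (count, mean, variance) of the features held by one client."""
    stats = {}
    for c in np.unique(labels):
        f = feats[labels == c]
        stats[int(c)] = (len(f), f.mean(axis=0), f.var(axis=0))
    return stats

def merge_stats(all_stats):
    """Pool the per-class statistics of all clients (assumed Gaussian model)."""
    merged = {}
    for c in range(NUM_CLASSES):
        parts = [s[c] for s in all_stats if c in s]
        if not parts:
            continue
        n = sum(p[0] for p in parts)
        mean = sum(p[0] * p[1] for p in parts) / n
        # Pooled variance computed as E[x^2] - mean^2, floored for stability.
        second = sum(p[0] * (p[2] + p[1] ** 2) for p in parts) / n
        merged[c] = (mean, np.maximum(second - mean ** 2, 1e-6))
    return merged

def sample_features(merged, per_class):
    """Draw synthetic features for every class from the shared Gaussians."""
    feats, labels = [], []
    for c, (mean, var) in merged.items():
        feats.append(rng.normal(mean, np.sqrt(var), size=(per_class, len(mean))))
        labels.append(np.full(per_class, c))
    return np.concatenate(feats), np.concatenate(labels)

def train_high_level(w_high, feats, labels, lr=0.1, steps=50):
    """High-level part: softmax classifier trained by plain gradient descent."""
    onehot = np.eye(NUM_CLASSES)[labels]
    for _ in range(steps):
        logits = feats @ w_high
        logits -= logits.max(axis=1, keepdims=True)
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        w_high = w_high - lr * feats.T @ (probs - onehot) / len(feats)
    return w_high

# Two clients with different label sets (a simple form of heterogeneity).
w_low = rng.normal(size=(IN_DIM, FEAT_DIM))
clients = []
for classes in ([0, 1], [1, 2]):
    y = rng.choice(classes, size=60)
    x = rng.normal(size=(60, IN_DIM)) + y[:, None]  # class-dependent shift
    clients.append((x, y))

all_stats = [class_feature_stats(low_level(x, w_low), y) for x, y in clients]
merged = merge_stats(all_stats)
shared_f, shared_y = sample_features(merged, per_class=30)

# Each client trains the high-level part on its own features plus the shared
# samples, so every client sees a similar per-class feature distribution.
w_high = np.zeros((FEAT_DIM, NUM_CLASSES))
for x, y in clients:
    f = np.concatenate([low_level(x, w_low), shared_f])
    t = np.concatenate([y, shared_y])
    w_local = train_high_level(w_high, f, t)
```

In this sketch each client holds only two of the three classes, yet the high-level part is trained on features from all three. That is the sense in which sharing a reconstructed feature distribution could make the conditional distributions seen during local training more alike. A realistic setup would also update the low-level part and aggregate models across rounds, which is omitted here.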