Federated Learning (FL) is a powerful framework for privacy-preserving distributed learning that enables multiple clients to collaboratively train a global model without sharing raw data. However, noisy labels can severely degrade model performance, and handling them in FL remains a major challenge due to heterogeneous data distributions and communication constraints. To address this, we propose FedEFC, a novel method that tackles the impact of noisy labels in FL through two key techniques: (1) prestopping, which prevents overfitting to mislabeled data by dynamically halting training at an optimal point, and (2) loss correction, which adjusts model updates to account for label noise. In particular, we develop an effective loss correction technique tailored to the unique challenges of FL, including data heterogeneity and decentralized training. Furthermore, we provide a theoretical analysis, leveraging the composite proper loss property, to show that the FL objective under a noisy label distribution can be aligned with that under the clean label distribution. Extensive experiments validate the effectiveness of our approach: it consistently outperforms existing FL techniques in mitigating the impact of noisy labels, particularly under heterogeneous data settings (e.g., achieving up to a 41.64% relative performance improvement over an existing loss correction method).
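To make the loss-correction idea concrete: the abstract does not specify FedEFC's correction rule, but a standard forward correction (in the spirit of Patrini et al., 2017) passes the model's predicted class probabilities through an estimated noise transition matrix before computing the loss against the observed noisy labels. The sketch below is a minimal, hypothetical illustration of that generic centralized scheme; the function name, matrix `T`, and noise setup are assumptions for illustration, not the FL-tailored correction the paper develops.

```python
import torch
import torch.nn.functional as F

def forward_corrected_loss(logits, noisy_labels, T):
    """Generic forward loss correction (illustrative sketch, not FedEFC itself).

    T is an estimated noise transition matrix with
    T[i, j] ~ P(observed label = j | true label = i).
    """
    probs = F.softmax(logits, dim=1)   # model's estimate of the clean-label posterior
    noisy_probs = probs @ T            # implied distribution over observed (noisy) labels
    return F.nll_loss(torch.log(noisy_probs + 1e-12), noisy_labels)

# Hypothetical usage: 3 classes under symmetric 20% label noise
T = torch.full((3, 3), 0.1) + 0.7 * torch.eye(3)  # rows sum to 1
logits = torch.randn(8, 3)
noisy_labels = torch.randint(0, 3, (8,))
loss = forward_corrected_loss(logits, noisy_labels, T)
```

In a federated setting, estimating such a transition matrix is harder because each client sees only its own heterogeneous shard; this is the gap the paper's FL-specific loss correction is designed to address.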