Federated Learning (FL) has emerged as a promising way to train deep learning models across multiple data owners without exchanging raw data. However, non-IID data remains a key challenge in FL because it can significantly degrade the accuracy of the final model. Among the types of non-IID data, label skew is both common and challenging in image classification and other tasks. Instead of averaging the local models, as most previous studies do, we propose FedConcat, a simple and effective approach that concatenates the local models to form the base of the global model and thereby aggregates local knowledge. To reduce the size of the global model, we cluster the clients by their label distributions and collaboratively train one model within each cluster. We theoretically analyze the advantage of concatenation over averaging through the information bottleneck of deep neural networks. Experimental results demonstrate that FedConcat achieves significantly higher accuracy than previous state-of-the-art FL methods across various heterogeneous label-skew settings while incurring lower communication costs. Our code is publicly available.
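The two ideas in the abstract, grouping clients by label distribution and concatenating rather than averaging models, can be illustrated with a minimal NumPy sketch. This is not the authors' released code. The k-means routine with farthest-point initialization, the linear "encoders" standing in for per-cluster models, and all function names (`label_distribution`, `cluster_clients`, `averaged_features`, `concatenated_features`) are assumptions chosen for illustration only.

```python
import numpy as np

def label_distribution(labels, num_classes):
    """Normalized label histogram for one client."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    return counts / counts.sum()

def cluster_clients(dists, k, iters=20):
    """Plain k-means over label distributions (assumed clustering method).

    Uses deterministic farthest-point initialization so the toy example
    below is reproducible.
    """
    centers = [dists[0]]
    for _ in range(1, k):
        # Distance from every client to its nearest already-chosen center.
        gap = np.min([np.linalg.norm(dists - c, axis=1) for c in centers], axis=0)
        centers.append(dists[gap.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        d = np.linalg.norm(dists[:, None, :] - centers[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = dists[assign == j].mean(axis=0)
    return assign

def averaged_features(x, encoders):
    """Averaging baseline: one encoder whose weights are the mean of all encoders."""
    return x @ np.mean(encoders, axis=0)

def concatenated_features(x, encoders):
    """Concatenation: keep every encoder and stack their outputs side by side."""
    return np.concatenate([x @ w for w in encoders], axis=1)

# Toy label-skewed clients: two hold classes {0, 1}, two hold classes {2, 3}.
client_labels = [[0, 0, 0, 1], [0, 0, 1, 1], [2, 2, 2, 3], [2, 3, 3, 3]]
dists = np.array([label_distribution(l, 4) for l in client_labels])
assign = cluster_clients(dists, k=2)
print(assign)  # [0 0 1 1]

# Hypothetical per-cluster linear encoders (stand-ins for trained models).
rng = np.random.default_rng(0)
encoders = [rng.normal(size=(8, 3)) for _ in range(2)]
x = rng.normal(size=(5, 8))
print(averaged_features(x, encoders).shape)      # (5, 3)
print(concatenated_features(x, encoders).shape)  # (5, 6)
```

In this sketch the concatenated feature width grows linearly with the number of encoders, which is consistent with the abstract's motivation for training one model per cluster rather than keeping one per client.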