Federated learning is an emerging technique for training deep models over decentralized clients without exposing private data. However, it suffers from label distribution skew, which usually results in slow convergence and degraded model performance. This challenge can become more serious when participating clients operate in unstable circumstances and drop out frequently. Previous work and our empirical observations demonstrate that, in classification tasks, the classifier head is especially sensitive to label skew, and that the unstable performance of FedAvg stems mainly from the imbalance of training samples across classes. A biased classifier head also impairs the learning of feature representations. Maintaining a balanced classifier head is therefore important for building a better global model. To tackle this issue, we propose a simple yet effective framework that introduces a prior-calibrated softmax function for computing the cross-entropy loss and a prototype-based feature augmentation scheme to re-balance local training. Both components are lightweight enough for edge devices and facilitate global model aggregation. In extensive experiments on the FashionMNIST and CIFAR-10 datasets, our method improves model performance over existing baselines in the presence of non-IID data and client dropout.
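To make the idea of a prior-calibrated softmax concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state the calibration formula, so the sketch assumes a common form in which each logit is shifted by the log of the client's empirical class prior before the softmax is applied. The function names `class_prior` and `prior_calibrated_cross_entropy` and the smoothing constant `eps` are hypothetical.

```python
import math
from collections import Counter


def class_prior(labels, num_classes, eps=1e-8):
    """Empirical label distribution of one client's local data.

    eps is a hypothetical smoothing constant that keeps the prior of
    classes absent from the client strictly positive.
    """
    counts = Counter(labels)
    total = len(labels) + eps * num_classes
    return [(counts.get(c, 0) + eps) / total for c in range(num_classes)]


def prior_calibrated_cross_entropy(logits, target, prior):
    """Cross-entropy of one sample on logits shifted by the log class prior.

    ASSUMPTION: the calibration is z_c + log(pi_c); the source does not
    give the exact formula.
    """
    shifted = [z + math.log(p) for z, p in zip(logits, prior)]
    # Numerically stable log-sum-exp for the softmax normalizer.
    m = max(shifted)
    log_norm = m + math.log(sum(math.exp(s - m) for s in shifted))
    return log_norm - shifted[target]


# A client that holds three samples of class 0 and one of class 1.
prior = class_prior([0, 0, 0, 1], num_classes=2)
loss_common = prior_calibrated_cross_entropy([0.0, 0.0], target=0, prior=prior)
loss_rare = prior_calibrated_cross_entropy([0.0, 0.0], target=1, prior=prior)
```

Under this assumed form, a sample from the locally rare class incurs a larger loss than a sample from the common class when the raw logits are equal, so local training pushes the rare-class logit up and counteracts the bias that skewed labels would otherwise introduce into the classifier head. With a uniform prior the loss reduces to the ordinary cross-entropy.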