Federated learning (FL) induces both intra-client and inter-client class imbalance, with the latter, more than the former, producing biased client updates that degrade the distributed models. This bias is exacerbated during the server aggregation phase and has yet to be effectively addressed by conventional re-balancing methods. To this end, departing from off-the-shelf label- or loss-based approaches, we propose a gradient alignment (GA)-informed FL method, dubbed FedGA, in which we observe the importance of error asymmetry (EA) in update bias and explore its linkage to the gradient of the loss with respect to the raw logits. Concretely, GA, implemented via label calibration during backpropagation, prevents catastrophic forgetting of rare and missing classes, thereby improving model convergence and accuracy. Experimental results on five benchmark datasets demonstrate that FedGA outperforms the pioneering baseline FedAvg and four of its variants in minimizing EA and update bias, accordingly yielding higher F1 score and accuracy margins as the Dirichlet distribution sampling factor $\alpha$ increases. The code and further details are available at \url{https://anonymous.4open.science/r/FedGA-B052/README.md}.
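To make the link between label calibration and the logit gradient concrete, recall that for softmax cross-entropy the gradient of the loss with respect to the raw logits is simply $\mathrm{softmax}(z) - y$, so reshaping the target $y$ directly reshapes the per-class gradient. The sketch below is purely illustrative (the specific calibrated target values are hypothetical and not taken from the paper); it shows how a calibrated label can enlarge the gradient magnitude on a rare class relative to a plain one-hot target:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D logit vector
    e = np.exp(z - z.max())
    return e / e.sum()

def ce_grad(logits, target):
    # Gradient of cross-entropy w.r.t. raw logits: softmax(z) - y
    return softmax(logits) - target

logits = np.array([2.0, 0.5, -1.0])      # class 2 is under-predicted (rare)
one_hot = np.array([0.0, 0.0, 1.0])      # plain one-hot target
calibrated = np.array([-0.1, -0.1, 1.2]) # hypothetical calibrated target (sums to 1)

g_plain = ce_grad(logits, one_hot)
g_cal = ce_grad(logits, calibrated)

# The calibrated target yields a more negative (stronger) gradient on the
# rare class, pushing its logit up harder during backpropagation.
print(g_plain, g_cal)
```

Because both targets sum to one, each gradient still sums to zero across classes; calibration only redistributes how strongly each class's logit is pulled, which is the lever FedGA-style gradient alignment operates on.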