In federated learning (FL), a global model is generated by weighted aggregation of local models, with aggregation weights that are normalized (they sum to 1) and proportional to the local data sizes. In this paper, we revisit the weighted aggregation process and gain new insights into the training dynamics of FL. First, we find that the sum of the weights can be smaller than 1, which causes a global weight shrinking effect (analogous to weight decay) and improves generalization. We explore how the optimal shrinking factor is affected by clients' data heterogeneity and the number of local epochs. Second, we examine the relative aggregation weights among clients, which reflect each client's importance. We develop a notion of client coherence to study the learning dynamics and find that a critical point exists. Before this critical point is reached, more coherent clients play a more essential role in generalization. Based on these insights, we propose an effective method for Federated Learning with Learnable Aggregation Weights, named FedLAW. Extensive experiments verify that our method improves the generalization of the global model by a large margin across different datasets and models.
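As an illustration of the weight shrinking idea described in the abstract, the following is a minimal sketch of data-size-proportional aggregation in which the weights sum to a factor that may be smaller than 1. It is not the FedLAW method itself: the function name `aggregate`, the parameter `shrink`, and the use of a fixed rather than learned factor are all assumptions made for this example.

```python
import numpy as np


def aggregate(local_models, data_sizes, shrink=1.0):
    """Weighted average of client parameter vectors, scaled by `shrink`.

    Hypothetical helper for illustration only.
    shrink == 1.0: conventional normalized aggregation (weights sum to 1).
    shrink <  1.0: weights sum to `shrink`, so the aggregated parameters
                   are pulled toward zero, similar in spirit to weight decay.
    """
    sizes = np.asarray(data_sizes, dtype=float)
    # Weights are proportional to local data sizes and sum to `shrink`.
    weights = shrink * sizes / sizes.sum()
    # One row per client, one column per model parameter.
    stacked = np.stack([np.asarray(m, dtype=float) for m in local_models])
    return weights @ stacked


# Two toy clients with 2-parameter models and data sizes 1 and 3.
clients = [[1.0, 2.0], [3.0, 4.0]]
sizes = [1, 3]

normalized = aggregate(clients, sizes)           # weights sum to 1
shrunk = aggregate(clients, sizes, shrink=0.9)   # weights sum to 0.9
```

Here `shrunk` equals `0.9 * normalized`, so the relative weights among clients are unchanged and only their sum differs. In the setting the abstract describes, both the sum and the relative weights would be learned rather than fixed by hand.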