In federated learning (FL), a global model is generated by weighted aggregation of local models, where the aggregation weights are normalized (they sum to 1) and proportional to the local data sizes. In this paper, we revisit the weighted aggregation process and gain new insights into the training dynamics of FL. First, we find that the sum of the weights can be smaller than 1, which causes a global weight shrinking effect (analogous to weight decay) and improves generalization. We explore how the optimal shrinking factor is affected by clients' data heterogeneity and the number of local epochs. Second, we examine the relative aggregation weights among clients to characterize each client's importance. We develop client coherence to study the learning dynamics and find that a critical point exists. Before the critical point is reached, more coherent clients play more essential roles in generalization. Based on these insights, we propose an effective method for Federated Learning with Learnable Aggregation Weights, named FedLAW. Extensive experiments verify that our method can improve the generalization of the global model by a large margin on different datasets and models.
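To make the global weight shrinking effect concrete, the following is a minimal sketch of weighted aggregation in which the weights sum to a factor that may be smaller than 1. The function name `aggregate`, the parameter name `gamma`, and the toy client values are assumptions introduced for illustration. The shrinking factor is fixed by hand here, so this is not the learnable FedLAW procedure itself.

```python
def aggregate(local_params, data_sizes, gamma=1.0):
    """Aggregate client parameter vectors with weights that sum to gamma.

    local_params: list of equal-length lists of floats, one per client.
    data_sizes:   list of local dataset sizes, one per client.
    gamma:        hypothetical shrinking factor; gamma = 1 gives the usual
                  normalized, data-size-proportional aggregation.
    """
    total = float(sum(data_sizes))
    # Relative weights are proportional to data size; scaling by gamma
    # makes them sum to gamma instead of 1.
    weights = [gamma * n / total for n in data_sizes]
    num_params = len(local_params[0])
    return [
        sum(w * params[j] for w, params in zip(weights, local_params))
        for j in range(num_params)
    ]


# Toy example: two clients with two parameters each (values are made up).
clients = [[1.0, 2.0], [3.0, 4.0]]
sizes = [100, 300]

normalized = aggregate(clients, sizes)          # weights sum to 1
shrunk = aggregate(clients, sizes, gamma=0.9)   # weights sum to 0.9
```

With `gamma < 1`, every coordinate of the aggregated model is scaled toward zero by the same factor in each round, which is the sense in which the effect resembles weight decay.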