Federated learning algorithms, such as FedAvg, are negatively affected by data heterogeneity and partial client participation. To mitigate the latter problem, global variance reduction methods, like FedVARP, leverage stale model updates for non-participating clients. These methods are effective under homogeneous client participation. Yet, this paper shows that, when some clients participate much less than others, aggregating updates with different levels of staleness can detrimentally affect the training process. Motivated by this observation, we introduce FedStale, a novel algorithm that updates the global model in each round through a convex combination of "fresh" updates from participating clients and "stale" updates from non-participating ones. By adjusting the weight in the convex combination, FedStale interpolates between FedAvg, which only uses fresh updates, and FedVARP, which treats fresh and stale updates equally. Our analysis of FedStale convergence yields the following novel findings: i) it integrates and extends previous FedAvg and FedVARP analyses to heterogeneous client participation; ii) it underscores how the least participating client influences convergence error; iii) it provides practical guidelines to best exploit stale updates, showing that their usefulness diminishes as data heterogeneity decreases and participation heterogeneity increases. Extensive experiments featuring diverse levels of client data and participation heterogeneity not only confirm these findings but also show that FedStale outperforms both FedAvg and FedVARP in many settings.
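The server-side aggregation described above can be sketched in a few lines. The snippet below is a minimal illustration of one plausible reading of the update rule, not the paper's exact pseudocode: the function name `fedstale_aggregate`, the uniform 1/N normalization, and the memory-update convention are all assumptions made for illustration. The weight `beta` plays the interpolation role described in the abstract: `beta = 0` ignores stale updates (FedAvg-like), while `beta = 1` weighs stale and fresh updates equally (FedVARP-like).

```python
import numpy as np

def fedstale_aggregate(fresh, stale, participating, beta):
    """Sketch of a FedStale-style server aggregation step (assumed form).

    fresh: dict client_id -> update vector, for participating clients only
    stale: dict client_id -> last stored ("stale") update, for all N clients
    participating: set of client ids active in this round
    beta: weight on stale updates; 0 recovers a FedAvg-like rule,
          1 treats fresh and stale updates equally (FedVARP-like)
    """
    n = len(stale)  # total number of clients
    total = np.zeros_like(next(iter(stale.values())))
    for i in stale:
        if i in participating:
            total += fresh[i]          # "fresh" update from an active client
        else:
            total += beta * stale[i]   # down-weighted "stale" update
    return total / n

def update_memory(stale, fresh, participating):
    """Refresh the stored updates for clients that participated this round."""
    for i in participating:
        stale[i] = fresh[i]
    return stale
```

In practice the server would apply the aggregate as a pseudo-gradient step on the global model and then call `update_memory`, so that each client's stored update reflects its most recent participation.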