Federated learning (FL) is a promising framework for privacy-preserving collaborative learning, where model training tasks are distributed to clients and only the model updates need to be collected at a server. However, when deployed in mobile edge networks, clients may have unpredictable availability and drop out of the training process, which hinders the convergence of FL. This paper tackles this critical challenge. Specifically, we first investigate the convergence of the classical FedAvg algorithm with arbitrary client dropouts. We find that with the common choice of a decaying learning rate, FedAvg oscillates around a stationary point of the global loss function, which is caused by the divergence between the aggregated update and the desired central update. Motivated by this new observation, we then design a novel training algorithm named MimiC, in which the server modifies each received model update based on the previous ones. The modified model updates mimic the imaginary central update irrespective of client dropouts. Our theoretical analysis of MimiC shows that, with properly chosen learning rates, the divergence between the aggregated and central updates diminishes, guaranteeing convergence. Simulation results further demonstrate that MimiC maintains stable convergence and learns better models than the baseline methods.
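The abstract describes MimiC's server-side correction only at a high level. The following is a minimal illustrative sketch in Python of one way such a correction could operate: the server caches the most recent update received from each client and uses the cached updates to steer each incoming update toward a proxy for the full-participation (central) update. The specific correction form, the function name `mimic_server_round`, and the cache-based bookkeeping are assumptions made for illustration based on the abstract's description, not the paper's exact update rule.

```python
import numpy as np

def mimic_server_round(received, cache, num_clients):
    """One round of server-side aggregation with a dropout correction.

    received: {client_id: update} from the clients that survived this round.
    cache:    {client_id: update} holding each client's most recent update
              from earlier rounds (missing entries treated as zeros).
    Returns a corrected aggregate intended to mimic the central update.
    """
    zero = np.zeros_like(next(iter(received.values())))

    # Proxy for the previous "imaginary" central update: the average of the
    # latest update known from every client, dropped out or not.
    prev_central = sum(cache.get(i, zero) for i in range(num_clients)) / num_clients

    # Hypothetical correction: modify each received update based on previous
    # ones, removing the client-specific deviation from the central direction
    # so that the survivors' aggregate tracks a full-participation step.
    corrected = [g + prev_central - cache.get(i, zero) for i, g in received.items()]

    # Remember this round's updates for correcting future rounds.
    cache.update(received)

    return sum(corrected) / len(corrected)
```

Under these assumptions, the server needs only O(N) extra memory (one cached update per client) and no extra communication, since the correction is computed entirely from updates it has already received.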