Federated learning has been identified as an efficient decentralized training paradigm for scaling machine learning model training across a large number of devices while preserving the data privacy of the participants. FedAvg has become a foundational parameter update strategy for federated learning, promising to eliminate the effect of heterogeneous data across clients and to guarantee convergence. However, the synchronization barrier for parameter updates in each communication round causes clients to spend significant time waiting, which slows down training. Recent state-of-the-art solutions therefore propose semi-asynchronous approaches that reduce the waiting cost while still guaranteeing convergence. Nevertheless, these emerging semi-asynchronous approaches cannot eliminate the waiting time completely. We propose a fully asynchronous training paradigm, called FedFa, which guarantees model convergence and completely eliminates waiting time in federated learning by using a few buffered results on the server for parameter updates. Further, we provide a theoretical proof of the convergence rate of FedFa. Extensive experimental results indicate that our approach improves the training performance of federated learning by up to 6x and 4x speedup compared to state-of-the-art synchronous and semi-asynchronous strategies, respectively, while retaining high accuracy in both IID and Non-IID scenarios.
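The contrast between a synchronous round and a buffered, fully asynchronous update can be illustrated with a minimal sketch. This is not the FedFa algorithm itself, whose details the abstract does not give. It assumes a fixed-size sliding window of the most recent client results, with the global model set to the plain average of that window. The names `fedavg`, `BufferedAsyncServer`, `receive`, and `buffer_size` are hypothetical.

```python
from collections import deque


def fedavg(client_params, client_sizes):
    """Synchronous baseline: weighted average over ALL clients in a round.

    The server can only call this once every client has reported,
    which is the source of the waiting time.
    """
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(p[i] * s for p, s in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


class BufferedAsyncServer:
    """Illustrative asynchronous server (an assumption, not the paper's method).

    It keeps only the most recent `buffer_size` client results and refreshes
    the global model on every arrival, so no client waits for the others.
    """

    def __init__(self, init_params, buffer_size=3):
        self.params = list(init_params)
        # deque with maxlen drops the oldest entry automatically when full
        self.buffer = deque(maxlen=buffer_size)

    def receive(self, client_params):
        """Store one client's result and update the global model immediately."""
        self.buffer.append(list(client_params))
        n = len(self.buffer)
        # Assumed aggregation rule: unweighted mean of the buffered results
        self.params = [sum(vals) / n for vals in zip(*self.buffer)]
        return self.params


if __name__ == "__main__":
    server = BufferedAsyncServer([0.0, 0.0], buffer_size=2)
    for update in ([2.0, 4.0], [4.0, 8.0], [6.0, 0.0]):
        print(server.receive(update))
```

In this sketch the server never blocks: each call to `receive` returns a new global model right away, and the buffer limits how much a single stale or skewed client result can move it.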