As a promising approach to handling distributed data, Federated Learning (FL) has achieved major advances in recent years. FL enables collaborative model training by exploiting the raw data dispersed across multiple edge devices. However, the data is generally non-Independent and Identically Distributed (non-IID), i.e., statistical heterogeneity, and the edge devices differ significantly in both computation and communication capacity, i.e., system heterogeneity. Statistical heterogeneity leads to severe accuracy degradation, while system heterogeneity significantly prolongs the training process. To address these heterogeneity issues, we propose an Asynchronous Staleness-aware Model Update FL framework, i.e., FedASMU, with two novel methods. First, we propose an asynchronous FL system model with a dynamic model aggregation method that combines updated local models with the global model on the server for superior accuracy and high efficiency. Second, we propose an adaptive local model adjustment method that aggregates the fresh global model with local models on devices to further improve accuracy. Extensive experimentation with 6 models and 5 public datasets demonstrates that FedASMU significantly outperforms baseline approaches in terms of accuracy (0.60% to 23.90% higher) and efficiency (3.54% to 97.98% faster).
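The two aggregation steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the weighting rule, so everything here is an assumption: the function names, the polynomial staleness decay, and the fixed mixing ratio are hypothetical stand-ins chosen for illustration, not the FedASMU method itself, which describes its aggregation as dynamic and adaptive.

```python
from typing import List


def staleness_weight(current_round: int, model_round: int,
                     base: float = 0.5, decay: float = 0.5) -> float:
    """Hypothetical weight for a local update (assumed polynomial decay).

    The staler the update (the more rounds since the device fetched the
    global model), the smaller its weight in the server-side aggregation.
    """
    staleness = max(current_round - model_round, 0)
    return base * (1.0 + staleness) ** (-decay)


def server_aggregate(global_model: List[float], local_model: List[float],
                     current_round: int, model_round: int) -> List[float]:
    """Server side: merge one asynchronously arriving local model."""
    w = staleness_weight(current_round, model_round)
    return [(1.0 - w) * g + w * l for g, l in zip(global_model, local_model)]


def device_adjust(local_model: List[float], fresh_global: List[float],
                  mix: float = 0.5) -> List[float]:
    """Device side: pull a fresh global model into the local model.

    `mix` is an assumed fixed ratio used only to keep the sketch short.
    """
    return [(1.0 - mix) * l + mix * g for l, g in zip(local_model, fresh_global)]
```

For example, a local model trained from the round-5 global model and received at round 8 has staleness 3, so under these assumed defaults it is merged with weight 0.25, while an update with no staleness is merged with weight 0.5.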