As data is distributed across multiple edge devices, Federated Learning (FL) is attracting growing attention as a way to collaboratively train a machine learning model without transferring raw data. FL generally relies on a parameter server and a large number of edge devices throughout model training, with only several devices selected in each round. However, straggler devices may slow down the training process or even cause the system to crash during training, while other idle edge devices remain unused. As the bandwidth between the devices and the server is relatively low, the communication of intermediate data becomes a bottleneck. In this paper, we propose Time-Efficient Asynchronous federated learning with Sparsification and Quantization, i.e., TEASQ-Fed. TEASQ-Fed fully exploits edge devices by letting them actively apply for tasks and participate in the training process asynchronously. We use control parameters to choose an appropriate number of parallel edge devices that execute training tasks simultaneously. In addition, we introduce a caching mechanism and weighted averaging with respect to model staleness to further improve the accuracy. Furthermore, we propose a sparsification and quantization approach that compresses the intermediate data to accelerate the training. The experimental results reveal that TEASQ-Fed improves the accuracy (up to 16.67% higher) while accelerating the convergence of model training (up to twice as fast).
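To make the two mechanisms named in the abstract more concrete, the following is a minimal Python sketch, not the TEASQ-Fed implementation. The abstract does not specify the formulas, so everything here is an assumption chosen for illustration: the polynomial staleness decay and its parameter `alpha`, the use of top-k magnitude selection for sparsification, uniform symmetric quantization with a single scale, and all function names.

```python
from typing import Dict, List, Tuple


def staleness_weight(staleness: int, alpha: float = 0.5) -> float:
    """Assumed polynomial decay: the staler an update, the smaller its weight."""
    return (1.0 + staleness) ** (-alpha)


def aggregate(cached: List[Tuple[List[float], int]], alpha: float = 0.5) -> List[float]:
    """Staleness-weighted average of cached (model, staleness) pairs."""
    if not cached:
        raise ValueError("cache is empty")
    weights = [staleness_weight(s, alpha) for _, s in cached]
    total = sum(weights)
    size = len(cached[0][0])
    return [
        sum(w * model[i] for (model, _), w in zip(cached, weights)) / total
        for i in range(size)
    ]


def sparsify_top_k(values: List[float], k: int) -> Dict[int, float]:
    """Keep only the k entries with the largest magnitude, as {index: value}."""
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    return {i: values[i] for i in sorted(order[:k])}


def quantize(sparse: Dict[int, float], bits: int = 8) -> Tuple[Dict[int, int], float]:
    """Uniform symmetric quantization to signed integers plus one shared scale."""
    levels = 2 ** (bits - 1) - 1
    max_abs = max((abs(v) for v in sparse.values()), default=0.0)
    scale = max_abs / levels if max_abs > 0 else 1.0
    return {i: round(v / scale) for i, v in sparse.items()}, scale


def decompress(codes: Dict[int, int], scale: float, size: int) -> List[float]:
    """Rebuild a dense vector; entries dropped by sparsification become 0."""
    dense = [0.0] * size
    for i, q in codes.items():
        dense[i] = q * scale
    return dense


if __name__ == "__main__":
    # A device compresses its update before sending it to the server.
    update = [0.02, -1.5, 0.4, 0.0, 0.9]
    codes, scale = quantize(sparsify_top_k(update, k=2), bits=8)
    print(decompress(codes, scale, len(update)))

    # The server averages cached models, down-weighting the stale one.
    print(aggregate([([1.0, 1.0], 0), ([3.0, 3.0], 3)]))
```

In this sketch only the retained indices, their integer codes, and one floating-point scale would need to be transmitted, which is the kind of reduction in intermediate data that the abstract targets. The per-entry quantization error of a retained value is at most half of `scale`.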