Federated Learning (FL) is a collaborative machine learning (ML) framework that combines on-device training and server-based aggregation to train a common ML model among distributed agents. In this work, we propose an asynchronous FL design with periodic aggregation to tackle the straggler issue in FL systems. Considering limited wireless communication resources, we investigate the effect of different scheduling policies and aggregation designs on the convergence performance. Driven by the importance of reducing the bias and variance of the aggregated model updates, we propose a scheduling policy that jointly considers the channel quality and training data representation of user devices. The effectiveness of our channel-aware data-importance-based scheduling policy, compared with state-of-the-art methods proposed for synchronous FL, is validated through simulations. Moreover, we show that an "age-aware" aggregation weighting design can significantly improve the learning performance in an asynchronous FL setting.
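To make the idea of "age-aware" aggregation weighting concrete, the following is a minimal sketch, not the design proposed in this work. It assumes that each scheduled device reports a model update together with its age (the number of aggregation periods since it last received the global model), and that stale updates are discounted exponentially. The function name `age_aware_aggregate`, the `decay` parameter, and the exponential form of the discount are all illustrative assumptions.

```python
import numpy as np

def age_aware_aggregate(global_model, updates, ages, decay=0.5):
    """Add a staleness-discounted average of client updates to the global model.

    global_model: 1-D array holding the current global parameters
    updates:      one 1-D update (model delta) per scheduled device
    ages:         aggregation periods elapsed since each device last
                  received the global model (0 means fresh)
    decay:        hypothetical discount factor in (0, 1]; an assumption
                  made for this sketch, not a value from the paper
    """
    updates = np.asarray(updates, dtype=float)
    ages = np.asarray(ages, dtype=float)
    raw = decay ** ages          # assumed exponential discount for stale updates
    weights = raw / raw.sum()    # normalise so the weights sum to 1
    return global_model + weights @ updates, weights
```

With `decay=1.0` every update receives equal weight regardless of age, which gives a simple baseline against which an age-aware weighting can be compared.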