Federated learning (FL) is a popular privacy-preserving distributed training scheme in which multiple devices collaboratively train machine learning models by uploading local model updates. To improve communication efficiency, over-the-air computation (AirComp) has been applied to FL: it uses analog modulation to exploit the superposition property of radio waves, so that many devices can upload their model updates concurrently for aggregation. However, uplink channel noise causes considerable model aggregation distortion, which depends critically on device scheduling and degrades the performance of the learned model. In this paper, we propose a probabilistic device scheduling framework for over-the-air FL, named PO-FL, to mitigate the negative impact of channel noise: each device is scheduled with a certain probability, and its model update is reweighted by this probability during aggregation. We prove that this aggregation scheme is unbiased and establish the convergence of PO-FL for both convex and non-convex loss functions. Our convergence bounds show that device scheduling affects learning performance through the communication distortion and the global update variance. Based on the convergence analysis, we further develop a channel- and gradient-importance-aware algorithm to optimize the device scheduling probabilities in PO-FL. Extensive simulation results show that the proposed PO-FL framework with channel and gradient-importance awareness converges faster and produces better models than baseline methods.
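To illustrate why reweighting by the scheduling probability can give an unbiased aggregate, the following is a minimal sketch under assumptions that are not taken from the paper: each of N devices is scheduled by an independent Bernoulli draw with probability p_i, the target is the plain average of the local updates, and channel noise is modeled as zero-mean additive Gaussian noise. The function name `reweighted_aggregate` and its parameters are hypothetical; the actual PO-FL scheduling and transceiver design may differ.

```python
import numpy as np

def reweighted_aggregate(updates, probs, rng, noise_std=0.0):
    """One illustrative round of probability-reweighted aggregation.

    updates:   (N, d) array of local model updates.
    probs:     (N,) scheduling probabilities, each in (0, 1].
    rng:       a numpy.random.Generator.
    noise_std: std of the zero-mean Gaussian noise (assumed noise model).
    """
    updates = np.asarray(updates, dtype=float)
    probs = np.asarray(probs, dtype=float)
    n, d = updates.shape

    # Assumption: devices are scheduled independently with probability p_i.
    scheduled = rng.random(n) < probs

    # Scheduled devices get weight 1 / (N * p_i); unscheduled devices get 0.
    # Since E[scheduled_i] = p_i, the expected weight of device i is 1 / N.
    weights = scheduled / (n * probs)

    # Superposed (summed) reweighted updates plus zero-mean channel noise.
    noise = rng.normal(0.0, noise_std, size=d)
    return weights @ updates + noise
```

Under these assumptions the expectation of the output equals the average of all local updates, because the indicator of device i being scheduled has mean p_i and the noise has zero mean. Small probabilities inflate the variance of the aggregate through the 1/p_i factor, which is consistent with the abstract's point that scheduling affects learning through the global update variance.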