Federated learning (FL) is a popular privacy-preserving distributed training scheme, where multiple devices collaborate to train machine learning models by uploading local model updates. To improve communication efficiency, over-the-air computation (AirComp) has been applied to FL, which leverages analog modulation to harness the superposition property of radio waves such that numerous devices can upload their model updates concurrently for aggregation. However, the uplink channel noise incurs considerable model aggregation distortion, which is critically determined by the device scheduling and compromises the learned model performance. In this paper, we propose a probabilistic device scheduling framework for over-the-air FL, named PO-FL, to mitigate the negative impact of channel noise, where each device is scheduled according to a certain probability and its model update is reweighted using this probability in aggregation. We prove the unbiasedness of this aggregation scheme and demonstrate the convergence of PO-FL on both convex and non-convex loss functions. Our convergence bounds unveil that the device scheduling affects the learning performance through the communication distortion and global update variance. Based on the convergence analysis, we further develop a channel and gradient-importance aware algorithm to optimize the device scheduling probabilities in PO-FL. Extensive simulation results show that the proposed PO-FL framework with channel and gradient-importance awareness achieves faster convergence and produces better models than baseline methods.
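The unbiasedness claim can be illustrated with a minimal sketch. The abstract says only that each update is reweighted using its scheduling probability, so the inverse-probability weight 1/p_i used here is an assumption, and the exact weighting in PO-FL may differ. Channel noise is modeled as zero-mean additive Gaussian noise on the superposed signal, which is also an assumption. The names `aggregate_round` and `empirical_mean` are hypothetical and are not taken from the paper.

```python
import random


def aggregate_round(updates, probs, noise_std=0.0, rng=random):
    """One aggregation round with probabilistic scheduling (illustrative sketch).

    updates:   list of per-device update vectors (lists of floats)
    probs:     scheduling probability of each device, each in (0, 1]
    noise_std: std of the assumed zero-mean additive channel noise
    """
    n = len(updates)
    dim = len(updates[0])
    total = [0.0] * dim
    for g, p in zip(updates, probs):
        # Device is scheduled with probability p.
        if rng.random() < p:
            # Assumed reweighting: scale the update by 1/p so that
            # E[indicator / p * g] = g for every device.
            for j in range(dim):
                total[j] += g[j] / p
    # Add noise to the superposed signal, then normalize by the device count.
    return [(total[j] + rng.gauss(0.0, noise_std)) / n for j in range(dim)]


def empirical_mean(updates, probs, trials, noise_std=0.0, seed=0):
    """Average the aggregate over many independent rounds (Monte Carlo check)."""
    rng = random.Random(seed)
    dim = len(updates[0])
    acc = [0.0] * dim
    for _ in range(trials):
        agg = aggregate_round(updates, probs, noise_std, rng)
        for j in range(dim):
            acc[j] += agg[j]
    return [a / trials for a in acc]


if __name__ == "__main__":
    updates = [[1.0, -2.0], [3.0, 0.5], [-1.0, 4.0]]
    probs = [0.9, 0.5, 0.2]
    true_mean = [sum(col) / len(updates) for col in zip(*updates)]
    print("full-participation average:", true_mean)
    print("empirical PO-style average:", empirical_mean(updates, probs, 20000, 0.1))
```

Under these assumptions, the average over many rounds approaches the full-participation average, while a single round can deviate substantially when a device with a small scheduling probability is selected. This is the variance side of the trade-off that the convergence bounds attribute to device scheduling.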