Federated learning (FL), an emerging distributed machine learning paradigm, allows a large number of edge devices to collaboratively train a global model while preserving privacy. In this tutorial, we focus on FL via over-the-air computation (AirComp), which has been proposed to reduce the communication overhead of FL over wireless networks at the cost of degraded learning performance due to the model aggregation error arising from channel fading and noise. We first provide a comprehensive study of the convergence of AirComp-based FedAvg (AirFedAvg) algorithms under both strongly convex and non-convex settings, with constant and diminishing learning rates, in the presence of data heterogeneity. Through convergence and asymptotic analysis, we characterize the impact of the aggregation error on the convergence bound and provide insights for system design with convergence guarantees. We then derive convergence rates for AirFedAvg algorithms for strongly convex and non-convex objectives. Among the different types of local updates that edge devices can transmit (i.e., local model, gradient, and model difference), we reveal that transmitting the local model in AirFedAvg may cause the training procedure to diverge. In addition, we consider more practical signal processing schemes to improve communication efficiency and further extend the convergence analysis to the different forms of model aggregation error caused by these schemes. Extensive simulation results under different settings of objective functions, transmitted local information, and communication schemes verify the theoretical conclusions.
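To make the setting concrete, the following is a minimal toy sketch of FedAvg with a noisy aggregation step. It is not the system model analyzed in the tutorial: the quadratic per-device objective, the additive Gaussian term standing in for the aggregation error from fading and noise, and all names (`local_update`, `airfedavg`, `centers`, `noise_std`) are assumptions made for illustration. Devices transmit model differences, one of the three update types mentioned above.

```python
import numpy as np


def local_update(w, center, lr, steps):
    """Run `steps` gradient steps on the toy objective 0.5 * ||w - center||^2."""
    for _ in range(steps):
        w = w - lr * (w - center)  # gradient of the toy objective is (w - center)
    return w


def airfedavg(centers, rounds=50, lr=0.1, steps=5, noise_std=0.0, seed=0):
    """Toy FedAvg loop with an additive aggregation error (assumed model)."""
    rng = np.random.default_rng(seed)
    centers = np.asarray(centers, dtype=float)  # one row per device (heterogeneous data)
    w = np.zeros(centers.shape[1])
    for _ in range(rounds):
        # Each device sends its model difference rather than its local model.
        diffs = np.stack([local_update(w, c, lr, steps) - w for c in centers])
        # Assumption: the received signal is the ideal average plus Gaussian noise.
        error = rng.normal(0.0, noise_std, size=w.shape)
        w = w + diffs.mean(axis=0) + error
    return w


centers = [[1.0, 2.0], [3.0, -2.0], [5.0, 0.0]]
clean = airfedavg(centers, noise_std=0.0)
noisy = airfedavg(centers, noise_std=0.1)
print("minimizer of the average objective:", np.mean(centers, axis=0))
print("error-free aggregation:", clean)
print("noisy aggregation:     ", noisy)
```

In this toy problem, the error-free run approaches the minimizer of the average objective, while the noisy run stays in a neighborhood of it whose size depends on `noise_std`. This only illustrates the qualitative effect of an aggregation error; the actual bounds depend on the assumptions and signal processing schemes treated in the tutorial.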