Federated learning (FL) has gained popularity as a means of training models across devices at the wireless edge. This paper introduces delay-aware federated learning (DFL), which improves the efficiency of distributed machine learning (ML) model training by accounting for communication delays between the edge and the cloud. In DFL, each device runs multiple stochastic gradient descent (SGD) iterations on its local dataset during each global aggregation interval, and edge servers intermittently aggregate model parameters within their local subnetworks. At each global synchronization, the cloud server synchronizes the local models with the globally deployed model, which is computed via a local-global combiner. The convergence behavior of DFL is analyzed theoretically under a generalized data heterogeneity metric, yielding a set of conditions under which DFL achieves a sublinear convergence rate of O(1/k). Based on these findings, an adaptive control algorithm is developed for DFL that implements policies to reduce energy consumption and edge-to-cloud communication latency while targeting a sublinear convergence rate. Numerical evaluations show that, compared to existing FL algorithms, DFL achieves faster global model convergence, lower resource consumption, and greater robustness against communication delays. The proposed method improves efficiency for both convex and non-convex loss functions.
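The training loop described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the least-squares loss, the synthetic data, the function names (`local_sgd`, `dfl_sketch`, `make_subnets`), the combiner weight `alpha`, and the modeling of the edge-to-cloud delay as `delay_steps` extra local SGD iterations are all assumptions made for illustration. The sketch only shows the three ingredients named above: local SGD on each device, intermittent aggregation at edge servers, and a global synchronization in which a delayed global model is blended with the current local models.

```python
import numpy as np


def local_sgd(w, X, y, steps, lr, batch, rng):
    """Run `steps` mini-batch SGD iterations on a least-squares loss."""
    w = w.copy()
    for _ in range(steps):
        idx = rng.choice(len(X), size=batch, replace=False)
        grad = X[idx].T @ (X[idx] @ w - y[idx]) / batch
        w -= lr * grad
    return w


def dfl_sketch(subnets, dim, rounds=30, local_steps=4, edge_aggs=2,
               delay_steps=2, alpha=0.5, lr=0.05, batch=8, seed=0):
    """Toy hierarchical FL loop with a delayed global model.

    subnets: list of subnetworks, each a list of (X, y) device datasets.
    alpha:   hypothetical combiner weight placed on the stale global model.
    """
    rng = np.random.default_rng(seed)
    # One model per device, all initialized to zero.
    models = [[np.zeros(dim) for _ in devices] for devices in subnets]
    sizes = [sum(len(X) for X, _ in devices) for devices in subnets]
    w_global = np.zeros(dim)
    for _ in range(rounds):
        # Local SGD with intermittent aggregation at each edge server.
        for _ in range(edge_aggs):
            for s, devices in enumerate(subnets):
                updated = [
                    local_sgd(models[s][i], X, y, local_steps, lr, batch, rng)
                    for i, (X, y) in enumerate(devices)
                ]
                edge_avg = np.mean(updated, axis=0)
                models[s] = [edge_avg.copy() for _ in devices]
        # Cloud aggregation: average the edge models, weighted by data size.
        edge_models = [models[s][0] for s in range(len(subnets))]
        w_global = np.average(edge_models, axis=0, weights=sizes)
        # Assumed delay model: devices keep training while the global model
        # is in transit, then blend the stale global model with their
        # current local model (a simple stand-in for a local-global combiner).
        for s, devices in enumerate(subnets):
            for i, (X, y) in enumerate(devices):
                w_local = local_sgd(models[s][i], X, y, delay_steps,
                                    lr, batch, rng)
                models[s][i] = alpha * w_global + (1 - alpha) * w_local
    return w_global, models


def make_subnets(rng, w_true, n_subnets=2, n_devices=3, n_samples=40):
    """Synthetic regression data with a per-device feature shift."""
    subnets = []
    for _ in range(n_subnets):
        devices = []
        for _ in range(n_devices):
            shift = rng.uniform(-0.5, 0.5)  # mild data heterogeneity
            X = rng.normal(loc=shift, size=(n_samples, len(w_true)))
            y = X @ w_true + 0.01 * rng.normal(size=n_samples)
            devices.append((X, y))
        subnets.append(devices)
    return subnets


w_true = np.array([1.0, -2.0, 0.5])
data = make_subnets(np.random.default_rng(1), w_true)
w_hat, _ = dfl_sketch(data, dim=3)
print("distance to true weights:", np.linalg.norm(w_hat - w_true))
```

Setting `alpha=1.0` and `delay_steps=0` reduces the sketch to a plain hierarchical scheme in which every device simply adopts the global model at each synchronization; smaller values of `alpha` retain more of the progress made locally while the global model was in transit.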