Federated learning (FL) has gained popularity as a means of training models distributed across the wireless edge. The paper introduces delay-aware hierarchical federated learning (DFL), which improves the efficiency of distributed machine learning (ML) model training by accounting for communication delays between edge and cloud. Unlike traditional federated learning, DFL performs multiple stochastic gradient descent (SGD) iterations on device datasets within each global aggregation period and intermittently aggregates model parameters through edge servers in local subnetworks. During global synchronization, the cloud server merges the local models with the outdated global model using a local-global combiner, preserving crucial elements of both and improving learning efficiency in the presence of delay. A set of conditions is derived under which DFL attains a sublinear convergence rate of O(1/k). Based on these findings, an adaptive control algorithm is developed for DFL that implements policies to reduce energy consumption and communication latency while targeting a sublinear convergence rate. Numerical evaluations show that, compared to existing FL algorithms, DFL achieves faster global model convergence, lower resource consumption, and greater robustness against communication delays. These improvements are observed for both convex and non-convex loss functions.
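The two aggregation steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the local-global combiner, so the convex combination below, the mixing weight `alpha`, and the function names `edge_aggregate` and `local_global_combine` are all assumptions made for illustration and may differ from the paper's definitions.

```python
from typing import List, Sequence

Vector = List[float]


def edge_aggregate(device_models: Sequence[Vector], sizes: Sequence[float]) -> Vector:
    """Average device models within one subnetwork, weighted by dataset size.

    Assumption: the edge server uses a dataset-size-weighted mean; the
    abstract only says that parameters are aggregated through edge servers.
    """
    total = float(sum(sizes))
    dim = len(device_models[0])
    return [
        sum(size * model[i] for model, size in zip(device_models, sizes)) / total
        for i in range(dim)
    ]


def local_global_combine(local_model: Vector, stale_global: Vector, alpha: float) -> Vector:
    """Blend a local model with the delayed (outdated) global model.

    Assumption: the combiner is a convex combination with weight `alpha`
    on the local model and (1 - alpha) on the stale global model.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [(1.0 - alpha) * g + alpha * w for w, g in zip(local_model, stale_global)]


# Two devices in one subnetwork, holding 1 and 3 data points respectively.
edge_model = edge_aggregate([[1.0, 1.0], [3.0, 3.0]], sizes=[1, 3])

# The global model arrives late; keep 25% of the local progress.
combined = local_global_combine([2.0, 4.0], [0.0, 0.0], alpha=0.25)
```

In this sketch, setting `alpha` to 0 discards the local progress made while waiting for the delayed global model, and setting it to 1 ignores the global model entirely. Intermediate values keep part of each, which is the trade-off the combiner is described as managing.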