Taming Subnet-Drift in D2D-Enabled Fog Learning: A Hierarchical Gradient Tracking Approach

Federated learning (FL) encounters scalability challenges when implemented over fog networks. Semi-decentralized FL (SD-FL) proposes a solution that divides model cooperation into two stages: at the lower stage, device-to-device (D2D) communications are employed for local model aggregations within subnetworks (subnets), while the upper stage handles device-server (DS) communications for global model aggregations. However, existing SD-FL schemes rely on gradient diversity assumptions that become performance bottlenecks as data distributions become more heterogeneous. In this work, we develop semi-decentralized gradient tracking (SD-GT), the first SD-FL methodology that removes the need for such assumptions by incorporating tracking terms into device updates for each communication layer. Analytical characterization of SD-GT reveals convergence upper bounds for both non-convex and strongly-convex problems, for a suitable choice of step size. We employ the resulting bounds in the development of a co-optimization algorithm that selects subnet sampling rates and the number of D2D rounds according to a performance-efficiency trade-off. Our subsequent numerical evaluations demonstrate that SD-GT obtains substantial improvements in trained model quality and communication cost relative to baselines in SD-FL and gradient tracking on several datasets.
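
To make the role of a tracking term concrete, the sketch below contrasts textbook single-layer decentralized gradient descent with textbook gradient tracking on a toy problem. It is not the SD-GT algorithm from this paper, which applies tracking terms at both the D2D and DS layers. Everything in it is an assumption chosen for illustration: the four-device ring, the mixing matrix `W`, the scalar quadratic objectives defined by `a` and `b`, the step size, and the helper names `grad`, `mix`, and `run`.

```python
# Illustrative only: standard single-layer gradient tracking on a toy problem.
# Device i holds f_i(x) = 0.5 * a[i] * (x - b[i])**2; the goal is to minimize
# the sum of all f_i using only neighbour-to-neighbour (D2D-style) averaging.

def grad(a_i, b_i, x_i):
    """Gradient of the local quadratic objective."""
    return a_i * (x_i - b_i)

def mix(W, v):
    """One consensus round: each device takes a weighted average of neighbours."""
    return [sum(w * vj for w, vj in zip(row, v)) for row in W]

def run(W, a, b, alpha, steps, tracking):
    """Run decentralized gradient descent, with or without a tracking term."""
    n = len(a)
    x = [0.0] * n
    g = [grad(a[i], b[i], x[i]) for i in range(n)]
    y = list(g)  # tracking variable, initialised to the local gradient
    for _ in range(steps):
        direction = y if tracking else g
        mixed_x = mix(W, x)
        x_new = [mixed_x[i] - alpha * direction[i] for i in range(n)]
        g_new = [grad(a[i], b[i], x_new[i]) for i in range(n)]
        if tracking:
            # y follows the network-average gradient: mix, then add the change
            # in the local gradient.
            mixed_y = mix(W, y)
            y = [mixed_y[i] + g_new[i] - g[i] for i in range(n)]
        x, g = x_new, g_new
    return x

# Hypothetical setup: 4 devices on a ring with a doubly stochastic mixing matrix.
W = [
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
]
a = [1.0, 2.0, 3.0, 4.0]    # heterogeneous curvatures
b = [-5.0, 0.0, 5.0, 10.0]  # heterogeneous local minimizers
x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(a)  # minimizer of the sum

x_gt = run(W, a, b, alpha=0.02, steps=2000, tracking=True)
x_dgd = run(W, a, b, alpha=0.02, steps=2000, tracking=False)
err_gt = max(abs(v - x_star) for v in x_gt)
err_dgd = max(abs(v - x_star) for v in x_dgd)
print("max error with tracking:   ", err_gt)
print("max error without tracking:", err_dgd)
```

In this toy setting, the run without tracking settles at a point biased by the heterogeneity of the local objectives when the step size is constant, whereas the run with tracking reaches the minimizer of the summed objective. This is the kind of heterogeneity-induced drift that the abstract says SD-GT addresses without gradient diversity assumptions.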

