Federated Learning (FL) has emerged as a fundamental learning paradigm for harnessing massive data scattered across geo-distributed edge devices in a privacy-preserving manner. Given the heterogeneous deployment of edge devices, however, their data are usually Non-IID, introducing significant challenges to FL, including degraded training accuracy, intensive communication costs, and high computing complexity. Traditional approaches typically rely on adaptive mechanisms, which may suffer from scalability issues, increased computational overhead, and limited adaptability to diverse edge environments. To address these issues, this paper instead leverages the observation that computation offloading provides inherent functionalities, such as node matching and service correlation, that can reshape data distributions, and proposes the Federated learning based on computing Offloading (FlocOff) framework to tackle both data heterogeneity and resource constraints. Specifically, FlocOff formulates the FL process with Non-IID data in edge scenarios and provides a rigorous analysis of the impact of imbalanced data distribution. Based on this, FlocOff decouples the optimization into two steps: (1) Minimizing the Kullback-Leibler (KL) divergence via Computation Offloading scheduling (MKL-CO); (2) Minimizing the Communication Cost through Resource Allocation (MCC-RA). Extensive experimental results demonstrate that the proposed FlocOff effectively improves model convergence and accuracy by 14.3\%-32.7\% while reducing data heterogeneity under various data distributions.
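As a rough illustration of the heterogeneity measure minimized in the MKL-CO step, the sketch below computes the KL divergence between a client's empirical label distribution and a uniform reference distribution; the helper names and the toy label data are illustrative assumptions, not part of the FlocOff implementation.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as probability lists.

    A small eps guards against log(0) when a class is absent on one side.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def label_distribution(labels, num_classes):
    """Empirical class distribution of a client's local label set."""
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    total = len(labels)
    return [c / total for c in counts]

# Hypothetical client whose local data is skewed toward class 0.
client_labels = [0, 0, 0, 1, 2]
uniform_ref = [1 / 3] * 3  # balanced reference over 3 classes

p = label_distribution(client_labels, 3)
skew = kl_divergence(p, uniform_ref)  # > 0: the client is Non-IID
```

A scheduler in the spirit of MKL-CO could score candidate offloading decisions by how much they shrink this divergence after data is reshaped across nodes.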