The conventional federated learning (FedL) architecture distributes machine learning (ML) across worker devices by having them train local models that are periodically aggregated by a server. However, FedL ignores two important characteristics of contemporary wireless networks: (i) the network may contain heterogeneous communication/computation resources, and (ii) there may be significant overlaps in devices' local data distributions. In this work, we develop a novel optimization methodology that jointly accounts for these factors via intelligent device sampling complemented by device-to-device (D2D) offloading. Our optimization aims to select the best combination of sampled nodes and data offloading configuration to maximize FedL training accuracy while minimizing data processing and D2D communication resource consumption, subject to realistic constraints on the network topology and device capabilities. Theoretical analysis of the D2D offloading subproblem leads to new FedL convergence bounds and an efficient sequential convex optimizer. Using these results, we develop a sampling methodology based on graph convolutional networks (GCNs), which learns the relationship between network attributes, sampled nodes, and D2D data offloading to maximize FedL accuracy. Through evaluation on popular datasets and real-world network measurements from our edge testbed, we find that our methodology outperforms popular device sampling methodologies from the literature in terms of ML model performance, data processing overhead, and energy consumption.
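To make the GCN-based sampling idea concrete, the following is a minimal sketch, not the authors' implementation: it assumes a hypothetical setup in which each device is a graph node with made-up features (compute capacity, channel quality, local data size), the D2D connectivity graph supplies the adjacency matrix, and a small two-layer GCN produces a per-device score from which the top-k devices are sampled for FedL participation. All function names, feature choices, and weights here are illustrative assumptions; in the paper the sampler would be trained to maximize FedL accuracy.

```python
# Hypothetical sketch of GCN-based device sampling (not the authors' code).
# Assumes: binary symmetric D2D adjacency A, per-device feature matrix X,
# and untrained random GCN weights standing in for learned parameters.
import numpy as np


def normalize_adjacency(A: np.ndarray) -> np.ndarray:
    """Symmetrically normalize with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt


def gcn_scores(A: np.ndarray, X: np.ndarray, W1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """Two-layer GCN: H = ReLU(A_norm X W1); score = sigmoid(A_norm H w2)."""
    A_norm = normalize_adjacency(A)
    H = np.maximum(A_norm @ X @ W1, 0.0)   # first graph-convolution layer
    logits = A_norm @ H @ w2               # second layer -> per-device logit
    return 1.0 / (1.0 + np.exp(-logits))   # sampling score in (0, 1)


def sample_devices(scores: np.ndarray, k: int) -> np.ndarray:
    """Select the k highest-scoring devices for FedL participation."""
    return np.argsort(-scores)[:k]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, f, hidden = 6, 3, 8                          # devices, features, hidden width
    A = (rng.random((n, n)) < 0.4).astype(float)    # random D2D connectivity
    A = np.triu(A, 1)
    A = A + A.T                                     # symmetric, no self-loops
    X = rng.random((n, f))                          # [compute, channel quality, data size]
    W1 = rng.standard_normal((f, hidden)) * 0.1     # would be trained in practice
    w2 = rng.standard_normal(hidden) * 0.1
    print(sample_devices(gcn_scores(A, X, W1, w2), k=3))
```

In the full methodology described by the abstract, the selected devices would additionally receive offloaded data from their unsampled D2D neighbors before local training; the sketch above covers only the scoring and selection step.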