Federated learning (FL) enables edge nodes to collaboratively contribute to constructing a global model without sharing their data. This is accomplished by devices computing local, private model updates that are then aggregated by a server. However, computational resource constraints and network communication can become a severe bottleneck for the larger model sizes typical of deep learning applications. Edge nodes tend to have limited hardware resources (RAM, CPU), and the network bandwidth and reliability at the edge are a concern for scaling federated fleet applications. In this paper, we propose and evaluate an FL strategy inspired by transfer learning in order to reduce resource utilization on devices, as well as the load on the server and network in each global training round. For each local model update, we randomly select layers to train, freezing the remaining part of the model. In doing so, we can reduce both server load and communication costs per round by excluding all untrained layer weights from being transferred to the server. The goal of this study is to empirically explore the potential trade-off between resource utilization on devices and global model convergence under the proposed strategy. We implement the approach using the federated learning framework FEDn. We carried out a number of experiments over different datasets (CIFAR-10, CASA, and IMDB), performing different tasks using different deep-learning model architectures. Our results show that training the model partially can accelerate the training process, utilize on-device resources more efficiently, and reduce data transmission by around 75% and 53% when we train 25% and 50% of the model layers, respectively, without harming the resulting global model accuracy.
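The per-round mechanism described above (randomly choose a fraction of layers to train, freeze the rest, and upload only the trained layers' weights) can be sketched as follows. This is a minimal illustrative mock, not the FEDn implementation: the layer representation, function names, and the mock "training" step are assumptions for demonstration.

```python
import random


def select_trainable_layers(layer_names, fraction, rng):
    """Randomly pick roughly `fraction` of the layers to train this round."""
    k = max(1, int(len(layer_names) * fraction))
    return set(rng.sample(layer_names, k))


def local_update(weights, trainable, step=0.01):
    """Mock local training: only the selected (unfrozen) layers change."""
    return {name: w + step for name, w in weights.items() if name in trainable}


def client_round(weights, fraction, rng):
    """One local round: train a random subset of layers and return
    only those layers' weights for upload; frozen layers stay on-device."""
    trainable = select_trainable_layers(list(weights), fraction, rng)
    return local_update(weights, trainable)


# Toy model with 8 "layers"; training 25% of them means only 2 layer
# weight tensors are transmitted to the server this round.
rng = random.Random(0)
model = {f"layer{i}": 0.0 for i in range(8)}
upload = client_round(model, fraction=0.25, rng=rng)
print(f"{len(upload)} of {len(model)} layers transmitted")
```

The server would then aggregate each uploaded layer across the clients that trained it, leaving the frozen layers of each client's model untouched for that round.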