Federated Learning (FL) has emerged as a privacy-preserving method for training machine learning models in a distributed manner on edge devices. However, on-device models face inherent computational and memory limitations, which can constrain gradient updates. As model size increases, the frequency of gradient updates on edge devices decreases, ultimately leading to suboptimal training within any given FL round. This limits the feasibility of deploying advanced, large-scale models on edge devices and hinders potential performance gains. To address this issue, we propose FedRepOpt, a gradient re-parameterized optimizer for FL. Gradient re-parameterization allows a simple local model to be trained to a performance similar to that of a complex model by modifying the optimizer's gradients according to a set of model-specific hyperparameters obtained from the complex model. In this work, we focus on VGG-style and Ghost-style models in the FL environment. Extensive experiments demonstrate that models using FedRepOpt achieve significant performance gains of 16.7% and 11.4% over the RepGhost-style and RepVGG-style networks, respectively, while converging 11.7% and 57.4% faster than their complex counterparts.
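The core mechanism can be illustrated with a minimal sketch: instead of changing the model architecture, the optimizer multiplies each parameter's gradient by a constant, model-specific mask before the update step, so a plain model trained this way behaves like its re-parameterized complex counterpart. The function name, the `1 + scale**2` mask formula, and the toy objective below are illustrative assumptions, not the paper's exact derivation.

```python
# Sketch of a gradient re-parameterized SGD step (hypothetical helper).
# Each gradient is scaled by a fixed mask derived offline from the
# complex model's branch hyperparameters (assumed form: 1 + scale**2,
# loosely following the idea of folding a scaled parallel branch into
# the gradient of the plain branch).

def repopt_sgd_step(params, grads, masks, lr=0.1):
    """One SGD step with per-parameter constant gradient masks."""
    return [p - lr * m * g for p, g, m in zip(params, grads, masks)]

# Toy usage: minimise f(w) = (w - 3)^2 for a single scalar parameter.
w = [0.0]
scale = 0.5                  # hypothetical branch scale from the complex model
masks = [1.0 + scale ** 2]   # constant mask applied at every step
for _ in range(200):
    grads = [2.0 * (w[0] - 3.0)]          # gradient of (w - 3)^2
    w = repopt_sgd_step(w, grads, masks, lr=0.1)
```

Because the masks are constants fixed before training, this changes only the optimizer's update rule, leaving the local model's forward pass (and hence its on-device compute and memory cost) untouched.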