Federated Learning (FL) has emerged as a privacy-preserving method for training machine learning models in a distributed manner on edge devices. However, on-device models face inherent computational and memory limitations, which can constrain gradient updates. As model size increases, the frequency of gradient updates on edge devices decreases, ultimately leading to suboptimal training within any given FL round. This limits the feasibility of deploying advanced, large-scale models on edge devices and hinders potential performance gains. To address this issue, we propose FedRepOpt, a gradient re-parameterized optimizer for FL. Gradient re-parameterization allows a simple local model to be trained to a performance similar to that of a complex model by modifying the optimizer's gradients according to a set of model-specific hyperparameters obtained from the complex model. In this work, we focus on VGG-style and Ghost-style models in the FL environment. Extensive experiments demonstrate that models using FedRepOpt achieve significant performance gains of 16.7% and 11.4% over the RepGhost-style and RepVGG-style networks, respectively, while converging 11.7% and 57.4% faster than their complex counterparts.
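The core mechanism can be illustrated with a minimal sketch: instead of changing the model architecture, the optimizer multiplies each parameter's gradient by a constant, model-specific mask before the update step, so a plain model trained this way behaves like its re-parameterized complex counterpart. The function name, the `1 + scale**2` mask formula, and the toy objective below are illustrative assumptions, not the paper's exact derivation.

```python
# Sketch of a gradient re-parameterized SGD step (hypothetical helper).
# Each gradient is scaled by a fixed mask derived offline from the
# complex model's branch hyperparameters (assumed form: 1 + scale**2,
# loosely following the idea of folding a scaled parallel branch into
# the gradient of the plain branch).

def repopt_sgd_step(params, grads, masks, lr=0.1):
    """One SGD step with per-parameter constant gradient masks."""
    return [p - lr * m * g for p, g, m in zip(params, grads, masks)]

# Toy usage: minimise f(w) = (w - 3)^2 for a single scalar parameter.
w = [0.0]
scale = 0.5                  # hypothetical branch scale from the complex model
masks = [1.0 + scale ** 2]   # constant mask applied at every step
for _ in range(200):
    grads = [2.0 * (w[0] - 3.0)]          # gradient of (w - 3)^2
    w = repopt_sgd_step(w, grads, masks, lr=0.1)
```

Because the masks are constants fixed before training, this changes only the optimizer's update rule, leaving the local model's forward pass (and hence its on-device compute and memory cost) untouched.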