Federated Learning (FL) has emerged as a privacy-preserving method for training machine learning models in a distributed manner on edge devices. However, on-device models face inherent limitations in computational power and memory, potentially resulting in constrained gradient updates. As the model's size increases, the frequency of gradient updates on edge devices decreases, ultimately leading to suboptimal training outcomes within any particular FL round. This limits the feasibility of deploying advanced, large-scale models on edge devices and hinders potential performance gains. To address this issue, we propose FedRepOpt, a gradient re-parameterized optimizer for FL. The gradient re-parameterization method allows a simple local model to be trained to a performance similar to that of a complex model by modifying the optimizer's gradients according to a set of model-specific hyperparameters obtained from the complex model. In this work, we focus on VGG-style and Ghost-style models in the FL environment. Extensive experiments demonstrate that models using FedRepOpt obtain a significant performance boost of 16.7% and 11.4% over the RepGhost-style and RepVGG-style networks, respectively, while also converging 11.7% and 57.4% faster than their complex counterparts.
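To make the core idea concrete, the following is a minimal sketch of what a gradient re-parameterized optimizer step could look like, assuming RepOpt-style "gradient multiplication": each re-parameterized weight's gradient is scaled element-wise by a fixed multiplier derived from the hyperparameters of the equivalent multi-branch (complex) model before the standard SGD update. The class name `RepSGD` and the `grad_mults` mapping are hypothetical illustrations, not the paper's actual implementation.

```python
import torch


class RepSGD(torch.optim.SGD):
    """Sketch of a gradient re-parameterized SGD (assumption: RepOpt-style
    gradient multiplication). Before each update, the gradient of every
    re-parameterized weight is scaled element-wise by a pre-computed
    multiplier derived from the complex model's branch hyperparameters."""

    def __init__(self, params, grad_mults, lr=0.1, momentum=0.9):
        super().__init__(params, lr=lr, momentum=momentum)
        # grad_mults: dict mapping a parameter tensor to its fixed
        # multiplier tensor (hypothetical; obtained offline from the
        # complex model's model-specific hyperparameters).
        self.grad_mults = grad_mults

    @torch.no_grad()
    def step(self, closure=None):
        # Re-parameterize gradients in place, then defer to plain SGD.
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is not None and p in self.grad_mults:
                    p.grad.mul_(self.grad_mults[p])
        return super().step(closure)
```

Under this sketch, each FL client trains only the simple (plain) local model, and the complex model's structure is reflected solely in the precomputed gradient multipliers, which is what keeps the on-device compute and memory cost at the level of the simple model.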