Data heterogeneity is a major challenge for Federated Learning performance. Recently, momentum-based optimization techniques have been shown to be effective in mitigating the heterogeneity issue. Along with the model updates, the momentum updates are transmitted to the server and aggregated; local training initialized with the global momentum is therefore guided by the global history of the gradients. However, we identify a problem with the traditional momentum accumulation that makes it suboptimal in Federated Learning systems: momentum assigns less weight to historical gradients and more to recent ones, which incorporates more biased local gradients toward the end of local training. In this work, we propose a new way to compute the momentum estimate used for local initialization, named Reversed Momentum Federated Learning (RMFL). The key idea is to assign exponentially decaying weights to the gradients as time moves forward, the opposite of traditional momentum accumulation. We evaluate the effectiveness of RMFL on three popular benchmark datasets at different heterogeneity levels.
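The weighting contrast described above can be illustrated with a minimal numerical sketch. This is an assumption-laden illustration, not the paper's exact formulation: the function names, the choice of `beta = 0.9`, and the normalization of the reversed weights are all illustrative.

```python
import numpy as np

def traditional_momentum(grads, beta=0.9):
    """Standard EMA momentum: m_t = beta * m_{t-1} + (1 - beta) * g_t.

    Unrolled, the k-th gradient receives weight (1 - beta) * beta**(T - k),
    so the most recent gradients dominate the accumulated momentum.
    """
    m = np.zeros_like(grads[0])
    for g in grads:
        m = beta * m + (1 - beta) * g
    return m

def reversed_momentum(grads, beta=0.9):
    """Reversed weighting (illustrative): the k-th gradient receives weight
    beta**k, decaying as time moves forward, so early gradients dominate.
    Weights are normalized to sum to 1 (an assumption for this sketch).
    """
    weights = np.array([beta**k for k in range(len(grads))])
    weights /= weights.sum()
    return sum(w * g for w, g in zip(weights, grads))

# A gradient that appears only at the first local step:
grads = [np.array([1.0]), np.array([0.0]), np.array([0.0])]
print(traditional_momentum(grads))  # early gradient nearly forgotten
print(reversed_momentum(grads))     # early gradient still dominant
```

The sketch shows why the reversal matters for the bias the abstract describes: under traditional accumulation the late (more biased) local gradients receive the largest weights, while the reversed scheme emphasizes the earlier gradients computed closer to the global initialization.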