Federated Learning (FL) is a novel machine learning framework that enables multiple distributed devices to cooperatively train a shared model, coordinated by a central server, while keeping private data local. However, non-independent-and-identically-distributed (Non-IID) data samples and frequent communication among participants can significantly slow convergence and increase communication costs. To achieve fast convergence, we improve the conventional local updating rule by introducing the aggregated gradients at each local update epoch, and propose an adaptive learning rate algorithm that further takes into account the deviation between the local and global parameters. This adaptive learning rate design requires local information from all clients, including their local parameters and gradients, which is challenging to obtain because there is no communication during the local update epochs. To obtain a decentralized adaptive learning rate for each client, we use a mean field approach, introducing two mean field terms that estimate the average local parameters and the average gradients, respectively, so that clients do not need to exchange local information with each other at each local epoch. Numerical results show that the proposed framework outperforms state-of-the-art FL schemes in both model accuracy and convergence rate on IID and Non-IID datasets.
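The structure described above can be illustrated with a minimal toy sketch. The abstract does not give the actual update rule, the learning rate formula, or how the mean field terms are computed, so everything below is an assumption made for illustration: the scalar quadratic local losses, the names `phi_w` and `phi_g` for the two mean field terms, the mixing weight `mix`, and the rule `base_lr / (1 + |w - phi_w|)` are all hypothetical. The sketch only shows the overall shape: each client blends its own gradient with an estimate of the average gradient, scales its step by its deviation from the estimated average parameter, and exchanges nothing with other clients during local epochs.

```python
def local_grad(w, c):
    # Gradient of the toy local loss 0.5 * (w - c)^2 (assumed, not from the paper).
    return w - c


def adaptive_lr(base_lr, w_local, phi_w):
    # Hypothetical rule: shrink the step as the local parameter drifts
    # away from the estimated average parameter phi_w.
    return base_lr / (1.0 + abs(w_local - phi_w))


def run_round(w_global, centers, local_epochs=5, base_lr=0.1, mix=0.5):
    n = len(centers)
    # Stand-ins for the two mean field terms, fixed at the start of the round
    # so that clients need no communication during local epochs (assumption).
    phi_w = w_global
    phi_g = sum(local_grad(w_global, c) for c in centers) / n

    local_params = []
    for c in centers:  # each client holds different data (Non-IID via distinct c)
        w = w_global
        for _ in range(local_epochs):
            # Blend the client's own gradient with the estimated average gradient.
            g = (1.0 - mix) * local_grad(w, c) + mix * phi_g
            w -= adaptive_lr(base_lr, w, phi_w) * g
        local_params.append(w)

    # Server aggregation: plain average of the local parameters.
    return sum(local_params) / n


def train(centers, rounds=50, w_init=0.0, **kwargs):
    w = w_init
    for _ in range(rounds):
        w = run_round(w, centers, **kwargs)
    return w


if __name__ == "__main__":
    # Three clients whose local optima differ; the global optimum is their mean.
    print(train([-1.0, 1.0, 3.0], rounds=100))
```

In this toy setting the global optimum is the mean of the client optima (here 1.0), and the printed value should approach it. Because the hypothetical learning rate rule is nonlinear, the same guarantee need not hold exactly for asymmetric client optima; the paper's own rule and its convergence analysis are not reproduced here.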