Federated learning (FL) has recently emerged as a promising privacy-preserving distributed machine learning framework. It aims to collaboratively learn a shared global model by training locally on edge devices and aggregating the local models into a global one, without sharing raw data centrally on a cloud server. However, because local data are highly heterogeneous (Non-I.I.D.) across edge devices, FL can easily produce a global model whose gradients are more shifted on local datasets, which degrades model performance or even prevents convergence during training. In this paper, we propose a novel FL training framework, dubbed Fed-FSNet, which uses a properly designed Fuzzy Synthesizing Network (FSNet) to mitigate the Non-I.I.D. problem in FL at the source. Concretely, we maintain an edge-agnostic hidden model on the cloud server to estimate a less accurate but direction-aware inversion of the global model. The hidden model can then fuzzily synthesize several mimic I.I.D. data samples (sample features) conditioned only on the global model; these samples can be shared by edge devices to help FL training converge faster and better. Moreover, since the synthesizing process neither accesses the parameters/updates of local models nor analyzes individual local model outputs, our framework still preserves the privacy of FL. Experimental results on several FL benchmarks demonstrate that our method significantly mitigates the Non-I.I.D. issue and outperforms other representative methods.
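For concreteness, the aggregation step mentioned in the abstract (combining local models into a global one without moving raw data) can be sketched as a sample-count-weighted average of parameters. This is only an illustrative assumption in the style of federated averaging: the abstract does not state which aggregation rule Fed-FSNet uses, and the names `aggregate`, `local_models`, and `num_samples` are hypothetical.

```python
from typing import Dict, List

# Hypothetical representation: each model is a dict of flat parameter lists.
Params = Dict[str, List[float]]


def aggregate(local_models: List[Params], num_samples: List[int]) -> Params:
    """Weighted average of local parameters (FedAvg-style; an assumption,
    not the aggregation rule confirmed by the paper)."""
    if not local_models or len(local_models) != len(num_samples):
        raise ValueError("need exactly one sample count per local model")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    global_model: Params = {}
    for name in local_models[0]:
        averaged = [0.0] * len(local_models[0][name])
        for params, n in zip(local_models, num_samples):
            weight = n / total  # devices with more data contribute more
            for i, value in enumerate(params[name]):
                averaged[i] += weight * value
        global_model[name] = averaged
    return global_model
```

Only parameters and sample counts reach the server in this sketch. Under Non-I.I.D. data, the local parameters being averaged can point in conflicting directions, which is the failure mode the abstract attributes to heterogeneous edge devices.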