Federated learning (FL) is an emerging paradigm for decentralized training of machine learning models on distributed clients, without revealing the data to the central server. The learning scheme may be horizontal, vertical, or hybrid (both vertical and horizontal). Most existing research on deep neural network (DNN) modelling focuses on horizontal data distributions, while vertical and hybrid schemes are much less studied. In this paper, we propose FedEmb, a generalized algorithm for modelling vertical and hybrid DNN-based learning. Compared with existing work, our algorithm offers higher inference accuracy, stronger privacy-preserving properties, and lower client-server communication bandwidth demands. The experimental results show that FedEmb is an effective method for decentralized problems in which both the feature space and the subject space are split: it improves inference accuracy by 0.3% to 4.2% with limited privacy leakage for the datasets stored on local clients, and it reduces time complexity by 88.9% relative to the vertical baseline method.