Federated learning (FL) is an emerging paradigm for decentralized training of machine learning models on distributed clients without revealing their data to the central server. The learning scheme may be horizontal, vertical, or hybrid (both vertical and horizontal). Most existing research on deep neural network (DNN) modelling focuses on horizontal data distributions, while vertical and hybrid schemes are much less studied. In this paper, we propose FedEmb, a generalized algorithm for vertical and hybrid DNN-based learning. Compared with existing work, our algorithm offers higher inference accuracy, stronger privacy preservation, and lower client-server communication bandwidth demands. Experimental results show that FedEmb effectively tackles decentralized problems in which both the feature space and the subject space are split, improves inference accuracy by 0.3% to 4.2% with limited privacy leakage for the datasets stored on local clients, and reduces time complexity by 88.9% relative to the vertical baseline method.
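To make the three data layouts named in the abstract concrete, the following is a minimal sketch of how a toy table of subjects (rows) and features (columns) might be partitioned across clients. It illustrates only the horizontal, vertical, and hybrid splits, not FedEmb itself; the function names, the toy data, and the client counts are illustrative assumptions and do not come from the paper.

```python
# Toy dataset: rows are subjects (samples), columns are features.
# All names and sizes here are illustrative assumptions, not from the paper.
data = [[10 * r + c for c in range(4)] for r in range(6)]  # 6 subjects x 4 features


def horizontal_split(rows, n_clients):
    """Each client holds different subjects but the full feature set."""
    return [rows[i::n_clients] for i in range(n_clients)]


def vertical_split(rows, feature_groups):
    """Each client holds all subjects but only its own feature columns."""
    return [[[row[c] for c in cols] for row in rows] for cols in feature_groups]


def hybrid_split(rows, n_row_groups, feature_groups):
    """Split by subjects first, then by features within each subject group."""
    clients = []
    for block in horizontal_split(rows, n_row_groups):
        clients.extend(vertical_split(block, feature_groups))
    return clients


def shape(block):
    """Return (number of subjects, number of features) held by one client."""
    return (len(block), len(block[0]))


h = horizontal_split(data, 2)
v = vertical_split(data, [[0, 1], [2, 3]])
hy = hybrid_split(data, 2, [[0, 1], [2, 3]])

print([shape(b) for b in h])   # [(3, 4), (3, 4)]
print([shape(b) for b in v])   # [(6, 2), (6, 2)]
print([shape(b) for b in hy])  # [(3, 2), (3, 2), (3, 2), (3, 2)]
```

In the horizontal case every client sees all four features for a subset of subjects; in the vertical case every client sees all six subjects but only two features; the hybrid case splits along both axes, which is the setting the abstract describes as split feature and subject space.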