Federated Learning (FL) has emerged as a promising distributed learning paradigm with the added advantage of data privacy. With growing interest in collaboration among data owners, FL has gained significant attention from organizations. The idea of FL is to enable collaborating participants to train machine learning (ML) models on decentralized data without breaching privacy. In simpler words, federated learning is the approach of ``bringing the model to the data, instead of bringing the data to the model''. When applied to data that is partitioned vertically across participants, federated learning can build a complete ML model by combining local models, each trained only on the distinct features held at its local site. This architecture of FL is referred to as vertical federated learning (VFL), which differs from conventional FL on horizontally partitioned data. Because VFL differs from conventional FL, it comes with its own issues and challenges. In this paper, we present a structured literature review discussing the state-of-the-art approaches in VFL. Additionally, the literature review highlights the existing solutions to challenges in VFL and provides potential research directions in this domain.