Vertical federated learning (VFL) is a promising category of federated learning for scenarios in which data is vertically partitioned and distributed among parties. VFL enriches the description of each sample with features from different parties to improve model capacity. Compared with horizontal federated learning, VFL is in most cases applied to commercial cooperation between companies, and therefore carries substantial business value. In the past few years, VFL has attracted increasing attention in both academia and industry. In this paper, we systematically survey existing work on VFL from a layered perspective. From the hardware layer to the vertical federated system layer, researchers have contributed to various aspects of VFL. Moreover, applications of VFL now cover a wide range of areas, such as finance and healthcare. At each layer, we categorize the existing work and explore the open challenges to facilitate further research and development of VFL. In particular, we design a novel MOSP tree taxonomy to analyze the core component of VFL, namely the secure vertical federated machine learning algorithm. Our taxonomy considers four dimensions, i.e., machine learning model (M), protection object (O), security model (S), and privacy-preserving protocol (P), and provides a comprehensive investigation.