Vertical Federated Learning (VFL) is a federated learning paradigm in which multiple participants, who share the same set of samples but hold different features, jointly train machine learning models. Although VFL enables collaborative machine learning without sharing raw data, it remains susceptible to various privacy threats. In this paper, we conduct the first comprehensive survey of state-of-the-art privacy attacks and defenses in VFL. We provide taxonomies for both attacks and defenses based on their characteristics, and discuss open challenges and future research directions. Specifically, we structure our discussion around the model's life cycle, examining the privacy threats that arise at each stage of machine learning and their corresponding countermeasures. This survey not only serves as a resource for the research community but also offers clear guidance and actionable insights for practitioners seeking to safeguard data privacy throughout the model's life cycle.