In a vertical federated learning (VFL) system consisting of a central server and many distributed clients, the training data are vertically partitioned such that different features are privately stored on different clients. The goal of split VFL is to train a model that is split between the server and the clients. This paper addresses two major challenges in split VFL: 1) performance degradation due to straggling clients during training; and 2) data and model privacy leakage from the data embeddings that clients upload. We propose FedVS to address both challenges simultaneously. The key idea of FedVS is to design secret sharing schemes for the local data and models, such that information-theoretic privacy against colluding clients and a curious server is guaranteed, and the aggregation of all clients' embeddings is reconstructed losslessly by decrypting the computation shares from the non-straggling clients. Extensive experiments on various types of VFL datasets (including tabular, computer vision, and multi-view) demonstrate the universal advantages of FedVS in straggler mitigation and privacy protection over baseline protocols.
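The following minimal sketch illustrates the general principle that the key idea relies on: secret-shared values can be aggregated, and the aggregate recovered exactly from only a subset of responders. It is not the FedVS protocol. It assumes plain Shamir secret sharing over a prime field, integer (already quantized) scalar embeddings, and a simple sum as the aggregation. The names `make_shares`, `reconstruct`, `aggregate_with_stragglers`, and the modulus `PRIME` are hypothetical and chosen only for illustration.

```python
import secrets

# Illustrative field modulus (a Mersenne prime); an assumption, not from the paper.
PRIME = 2**61 - 1


def make_shares(secret, threshold, num_clients):
    """Shamir-share `secret` with a random degree-`threshold` polynomial.

    Any `threshold` shares are statistically independent of the secret;
    any `threshold + 1` shares determine it.
    """
    coeffs = [secret % PRIME] + [secrets.randbelow(PRIME) for _ in range(threshold)]
    shares = {}
    for x in range(1, num_clients + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation at point x
            y = (y * x + c) % PRIME
        shares[x] = y
    return shares


def reconstruct(points):
    """Lagrange-interpolate at 0 from a dict {x: y} of share points."""
    total = 0
    for xi, yi in points.items():
        num, den = 1, 1
        for xj in points:
            if xj != xi:
                num = (num * (-xj)) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total


def aggregate_with_stragglers(embeddings, threshold, stragglers):
    """Recover sum(embeddings) using only shares from non-straggling clients."""
    n = len(embeddings)
    # Each client shares its own embedding with every client.
    all_shares = [make_shares(e, threshold, n) for e in embeddings]
    # Client x locally adds the shares it holds (additive homomorphism).
    summed = {x: sum(s[x] for s in all_shares) % PRIME for x in range(1, n + 1)}
    # The server only hears back from the non-stragglers.
    responders = {x: y for x, y in summed.items() if x not in stragglers}
    if len(responders) < threshold + 1:
        raise ValueError("too many stragglers to reconstruct the aggregate")
    return reconstruct(responders)
```

With five clients holding embeddings 3, 5, 7, 11, and 13 and a privacy threshold of 2, calling `aggregate_with_stragglers([3, 5, 7, 11, 13], 2, {2, 5})` recovers the exact sum even though clients 2 and 5 never respond, because three summed shares suffice to interpolate a degree-2 polynomial. The trade-off in this toy setting is that tolerating more colluding parties (a larger threshold) leaves room for fewer stragglers.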