In a vertical federated learning (VFL) system consisting of a central server and many distributed clients, the training data are vertically partitioned such that different features are privately stored on different clients. The problem of split VFL is to train a model that is split between the server and the clients. This paper addresses two major challenges in split VFL: 1) performance degradation due to straggling clients during training; and 2) data and model privacy leakage from the data embeddings that clients upload. We propose FedVS to address both challenges simultaneously. The key idea of FedVS is to design secret sharing schemes for the local data and models, such that information-theoretic privacy against colluding clients and a curious server is guaranteed, and the aggregation of all clients' embeddings is reconstructed losslessly by decrypting the computation shares received from the non-straggling clients. Extensive experiments on various types of VFL datasets (including tabular, CV, and multi-view) demonstrate the universal advantages of FedVS in straggler mitigation and privacy protection over baseline protocols.
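The key idea above, recovering the aggregate of all clients' embeddings from the shares of only the non-straggling clients, can be illustrated with a minimal sketch. It uses textbook Shamir secret sharing over a prime field and is not the FedVS construction itself, whose details are not given in this abstract. The field modulus `PRIME`, the helper names `share`, `reconstruct`, and `aggregate_with_stragglers`, and the assumption that embeddings are already quantized to non-negative integers are all hypothetical choices made for illustration.

```python
import random

# Assumed field modulus (a Mersenne prime); not specified by the source.
PRIME = 2**61 - 1


def share(secret_vec, n, t, rng):
    """Shamir-share each coordinate with a random degree-t polynomial.

    Returns n share vectors; share i is the evaluation at x = i + 1.
    Any t shares reveal nothing about the secret; any t + 1 recover it.
    """
    shares = [[0] * len(secret_vec) for _ in range(n)]
    for j, s in enumerate(secret_vec):
        coeffs = [s % PRIME] + [rng.randrange(PRIME) for _ in range(t)]
        for i in range(n):
            x, y = i + 1, 0
            for c in reversed(coeffs):  # Horner evaluation mod PRIME
                y = (y * x + c) % PRIME
            shares[i][j] = y
    return shares


def reconstruct(points, share_vecs):
    """Lagrange-interpolate at x = 0 from t + 1 (point, share vector) pairs."""
    out = [0] * len(share_vecs[0])
    for k, xk in enumerate(points):
        num, den = 1, 1
        for m, xm in enumerate(points):
            if m != k:
                num = num * (-xm) % PRIME
                den = den * (xk - xm) % PRIME
        lam = num * pow(den, PRIME - 2, PRIME) % PRIME  # Fermat inverse
        for j, v in enumerate(share_vecs[k]):
            out[j] = (out[j] + lam * v) % PRIME
    return out


def aggregate_with_stragglers(embeddings, t, survivors, seed=0):
    """Recover the sum of all embeddings from t + 1 non-straggling clients.

    embeddings: one non-negative integer vector per client (assumed quantized).
    survivors:  indices of the clients whose results reach the server.
    """
    rng = random.Random(seed)  # illustration only; not a secure RNG
    n, dim = len(embeddings), len(embeddings[0])
    all_shares = [share(e, n, t, rng) for e in embeddings]
    # Client i sums the shares it received from every client (itself included).
    summed = [
        [sum(all_shares[c][i][j] for c in range(n)) % PRIME for j in range(dim)]
        for i in range(n)
    ]
    chosen = list(survivors)[: t + 1]
    return reconstruct([i + 1 for i in chosen], [summed[i] for i in chosen])


if __name__ == "__main__":
    embs = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15]]
    # Clients 0 and 2 straggle; the server hears only from clients 1, 3, 4.
    print(aggregate_with_stragglers(embs, t=2, survivors=[1, 3, 4]))
```

Because Shamir shares are additively homomorphic, each client can sum the shares it holds locally, and the server needs only `t + 1` of those sums to recover the exact aggregate. In this toy setting, that is what makes the result lossless despite stragglers.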