Most work in privacy-preserving federated learning (FL) has focused on horizontally partitioned datasets, where clients hold the same features and independently train complete client-level models. In vertical FL (VFL) settings, however, the features of individual data points are often scattered across different institutions, known as clients. Training in this setting requires participants to exchange intermediate outputs and gradients, which creates privacy leakage risks and slows convergence. In many real-world scenarios, VFL training also suffers from client stragglers and dropouts, a serious challenge that can significantly hinder training but has been largely overlooked in existing studies. In this work, we present vFedSec, the first dropout-tolerant VFL protocol, which supports the most general vertical framework. It achieves secure and efficient model training by combining a novel Secure Layer with an embedding-padding technique. We provide a theoretical proof that our design attains enhanced security while maintaining training performance. Extensive experiments also demonstrate that vFedSec is robust to client dropout and provides secure training with negligible computation and communication overhead. Compared to widely adopted homomorphic encryption (HE) methods, our approach achieves a speedup of more than 690x and reduces communication costs by more than 9.6x.