Most work in privacy-preserving federated learning (FL) has focused on horizontally partitioned datasets, where clients share the same set of features and can train complete models independently. However, in many interesting problems, individual data points are scattered across different clients/organizations in a vertical setting. Solutions for this type of FL require participants to exchange intermediate outputs and gradients, which risks privacy leakage when privacy and security concerns are not considered. In this work, we present vFedSec, a novel design with an innovative Secure Layer for training vertical FL securely and efficiently using state-of-the-art security modules from secure aggregation. We theoretically demonstrate that our method does not impact training performance while protecting private data effectively. Extensive experiments also show its applicability: our design achieves this protection with negligible computation and communication overhead. In addition, our method obtains a speedup of 9.1e2 to 3.8e4 times over the widely adopted homomorphic encryption (HE) method.
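To make the secure aggregation idea mentioned in the abstract concrete, the following is a minimal sketch of generic pairwise masking, in which each pair of clients derives a shared mask that one adds and the other subtracts, so the masks cancel in the sum. It is not the vFedSec Secure Layer itself, whose details the abstract does not give. The function names, the modulus, the integer-encoded updates, and the hard-coded seeds are all assumptions made for illustration; a real protocol would derive seeds through key agreement and use a cryptographically secure generator.

```python
import random

# Assumed modulus for masked integer arithmetic (illustrative choice).
MODULUS = 2**32


def pairwise_mask(client_id, num_clients, length, seeds):
    """Net mask for one client: add the shared stream for peers with a
    larger id, subtract it for peers with a smaller id."""
    mask = [0] * length
    for peer in range(num_clients):
        if peer == client_id:
            continue
        pair = (min(client_id, peer), max(client_id, peer))
        # random.Random is NOT cryptographically secure; it stands in for
        # a secure PRG seeded by a key both clients agreed on.
        rng = random.Random(seeds[pair])
        stream = [rng.randrange(MODULUS) for _ in range(length)]
        sign = 1 if client_id < peer else -1
        mask = [(m + sign * s) % MODULUS for m, s in zip(mask, stream)]
    return mask


def mask_update(update, mask):
    """Hide a client's integer-encoded update behind its net mask."""
    return [(u + m) % MODULUS for u, m in zip(update, mask)]


def aggregate(masked_updates):
    """Sum the masked updates; the pairwise masks cancel modulo MODULUS."""
    total = [0] * len(masked_updates[0])
    for vec in masked_updates:
        total = [(t + v) % MODULUS for t, v in zip(total, vec)]
    return total


def demo():
    num_clients = 3
    # Hypothetical integer-encoded intermediate outputs, one per client.
    updates = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
    # Hypothetical pre-shared seeds, one per unordered client pair.
    seeds = {(i, j): 1000 + 10 * i + j
             for i in range(num_clients)
             for j in range(i + 1, num_clients)}
    masked = [
        mask_update(updates[c], pairwise_mask(c, num_clients, 3, seeds))
        for c in range(num_clients)
    ]
    return aggregate(masked), masked


if __name__ == "__main__":
    total, masked = demo()
    print(total)  # [111, 222, 333]
```

The aggregator only ever sees the masked vectors, yet their sum equals the sum of the original updates, which is why this style of protection does not change the aggregated result.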