The split neural network, one of the most common architectures in vertical federated learning, is popular in industry for its privacy-preserving design. In this architecture, the party holding the labels, whose own feature data are insufficient, seeks cooperation from other parties to improve model performance. Each participant trains a self-defined bottom model that learns hidden representations from its own feature data and uploads the resulting embedding vectors to the top model held by the label holder for final prediction. This design allows the participants to train jointly without directly exchanging data. However, existing research shows that malicious participants may still infer label information from the uploaded embeddings, leading to privacy leakage. In this paper, we first propose an embedding extension attack that manipulates embeddings to undermine existing defense strategies, which rely on constraining the correlation between the uploaded embeddings and the labels. We then propose a new label obfuscation defense strategy, called `LabObf', which randomly maps each original integer-valued label to multiple real-valued soft labels with intertwined values, significantly increasing the difficulty for an attacker to infer the labels. We conduct experiments on four different types of datasets, and the results show that LabObf significantly reduces the attacker's success rate compared to unprotected models while maintaining desirable model accuracy.
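To make the label obfuscation idea concrete, the following is a minimal illustrative sketch, not the paper's actual LabObf construction: each integer label is assigned several random real-valued soft-label vectors drawn from overlapping ranges (so values intertwine across classes), and training examples are relabeled with one of their class's soft labels at random. All function names, the uniform sampling, and the parameter choices here are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_obfuscation_table(num_classes, soft_labels_per_class, dim):
    # Hypothetical sketch: each integer label gets several random
    # real-valued soft-label vectors drawn from the same interval,
    # so the value ranges of different classes overlap and an
    # observer cannot separate classes by value alone.
    return {c: rng.uniform(-1.0, 1.0, size=(soft_labels_per_class, dim))
            for c in range(num_classes)}

def obfuscate(labels, table):
    # Replace each integer label with one of its class's soft labels,
    # chosen uniformly at random per example.
    out = []
    for y in labels:
        candidates = table[int(y)]
        out.append(candidates[rng.integers(len(candidates))])
    return np.stack(out)

# Example: 3 classes, 4 soft labels per class, 3-dimensional soft labels.
table = make_obfuscation_table(num_classes=3, soft_labels_per_class=4, dim=3)
soft = obfuscate([0, 1, 2, 1], table)
```

The top model would then be trained to regress these soft labels, while only the label holder keeps the secret table needed to map predictions back to the original classes.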