Split learning is a privacy-preserving paradigm that reduces client-side computational cost while preserving data utility. It has attracted extensive attention and has been widely applied in fields such as smart health and smart transportation. Recent studies have concentrated mainly on privacy leakage in split learning, such as inference attacks and data reconstruction, whereas security issues (e.g., backdoor attacks) have received comparatively little attention. Nonetheless, security vulnerabilities in split learning pose a serious threat and can have grave consequences, such as illegal impersonation in a face recognition model. In this paper, we therefore propose a stealthy backdoor attack strategy, SBAT, tailored to the split learning architecture without label sharing, which exposes an inherent security vulnerability of split learning. We assume a potential attacker on the server side who aims to implant a backdoor into the model, and we consider two scenarios: one in which the client network architecture is known and one in which it is unknown. Unlike traditional backdoor attacks, which manipulate the training data and labels, we carry out the attack by injecting the trigger embedding into the server network. Specifically, SBAT achieves greater stealthiness by not modifying any intermediate parameters (e.g., gradients) during training and instead performing all malicious operations after training.