Distributed deep learning frameworks enable more efficient and privacy-aware training of deep neural networks across multiple clients. Split learning achieves this by splitting a neural network between a client and a server such that the client computes the initial set of layers and the server computes the rest. However, this method introduces a unique attack vector for a malicious server attempting to recover the client's private inputs: the server can direct the client model towards learning any task of its choice, e.g., towards outputting easily invertible values. With a concrete example already proposed (Pasquini et al., ACM CCS '21), such \textit{training-hijacking} attacks present a significant risk to the data privacy of split learning clients. We propose two methods by which a split learning client can detect whether it is being targeted by a training-hijacking attack. We experimentally evaluate the effectiveness of our methods, compare them with other potential solutions, and discuss various points related to their use. We conclude that by using the method that best suits its use case, a split learning client can consistently detect training-hijacking attacks and thus keep the information gained by the attacker to a minimum.
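To make the attack surface concrete, the following is a minimal toy sketch of the split learning exchange, not the setup evaluated in this work. It assumes a one-weight "layer" on each side, a squared-error loss, and a server that holds the labels; the names `Client`, `HonestServer`, `HijackingServer`, and `train` are hypothetical. The point it illustrates is that the client updates its weights from whatever gradient the server returns, so the server can substitute an objective of its own choosing. The hijacking objective here (driving the client's output to a fixed value) is a deliberately simple stand-in and is not the attack of Pasquini et al.

```python
class Client:
    """Holds the initial layer (here a single weight) and the private input."""

    def __init__(self, weight):
        self.weight = weight
        self._x = None

    def forward(self, x):
        self._x = x                      # private input never leaves the client
        return self.weight * x           # intermediate output sent to the server

    def backward(self, grad_h, lr):
        # Chain rule: dL/dw = dL/dh * dh/dw = grad_h * x.
        # The client cannot tell which loss grad_h was derived from.
        self.weight -= lr * grad_h * self._x


class HonestServer:
    """Computes the remaining layer and the gradient of the agreed task loss."""

    def __init__(self, weight):
        self.weight = weight

    def step(self, h, target, lr):
        err = self.weight * h - target
        grad_h = 2 * err * self.weight   # dL/dh for L = (w_s * h - target)^2
        self.weight -= lr * 2 * err * h  # update the server-side layer
        return err * err, grad_h


class HijackingServer:
    """Toy adversary (assumption): ignores the task and pulls h towards `goal`."""

    def __init__(self, goal):
        self.goal = goal

    def step(self, h, target, lr):
        diff = h - self.goal
        return diff * diff, 2 * diff     # gradient of the server's own objective


def train(client, server, data, steps, lr=0.1):
    """Run the client/server exchange and return the loss seen at each step."""
    losses = []
    for _ in range(steps):
        for x, target in data:
            h = client.forward(x)
            loss, grad_h = server.step(h, target, lr)
            client.backward(grad_h, lr)
            losses.append(loss)
    return losses
```

Running `train` with `HonestServer` reduces the task loss, while running the same client code with `HijackingServer` drives the client's output to the server's chosen value. In both cases the client executes identical steps, which is why detection has to rely on additional checks on the client side.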