Split learning enables efficient and privacy-aware training of a deep neural network by splitting the network so that the clients (data holders) compute the first layers and share only the intermediate output with the central, compute-heavy server. This paradigm introduces a new attack vector in which the server has full control over what the client models learn, and which has already been exploited to infer the private data of clients and to implant backdoors in the client models. Although previous work has shown that clients can successfully detect such training-hijacking attacks, the proposed methods rely on heuristics, require tuning of many hyperparameters, and do not fully utilize the clients' capabilities. In this work, we show that, under modest assumptions about the clients' compute capabilities, an out-of-the-box outlier detection method can detect existing training-hijacking attacks with almost-zero false positive rates. Through experiments on different tasks, we conclude that the simplicity of our approach, which we name \textit{SplitOut}, makes it a more viable and reliable alternative to earlier detection methods.
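To illustrate the kind of out-of-the-box detection described above, the following is a minimal sketch assuming the client locally simulates the server-side layers on a small data subset to collect "honest" gradients and then scores the gradients actually returned by the server with scikit-learn's \texttt{LocalOutlierFactor}. The function names, parameters, and protocol details are illustrative assumptions, not the exact procedure of SplitOut.

\begin{verbatim}
# Hedged sketch: client-side detection of training hijacking in split learning
# using an off-the-shelf outlier detector (scikit-learn's LocalOutlierFactor).
# Names, thresholds, and protocol details are illustrative assumptions.
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

def detect_hijacking(honest_grads, server_grads, n_neighbors=20):
    """Flag server-returned gradients that look unlike honest training.

    honest_grads: (n, d) gradients the client computed by locally simulating
                  the server-side layers on a small data subset.
    server_grads: (m, d) gradients actually received from the server.
    Returns a boolean array of length m; True marks a suspected hijacking step.
    """
    lof = LocalOutlierFactor(n_neighbors=n_neighbors, novelty=True)
    lof.fit(honest_grads)                    # learn the "honest" gradient distribution
    predictions = lof.predict(server_grads)  # +1 = inlier, -1 = outlier
    return predictions == -1

# Toy usage: honest gradients cluster near zero, a hijacked step lies far away.
rng = np.random.default_rng(0)
honest = rng.normal(0.0, 0.1, size=(200, 16))
received = np.vstack([rng.normal(0.0, 0.1, size=(5, 16)),
                      rng.normal(3.0, 0.1, size=(1, 16))])  # last row simulates an attack
print(detect_hijacking(honest, received))
\end{verbatim}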