Split learning enables efficient and privacy-aware training of a deep neural network by splitting the network so that the clients (data holders) compute the first layers and share only the intermediate output with a central, compute-heavy server. This paradigm introduces a new attack vector: the server has full control over what the client models learn, which has already been exploited to infer clients' private data and to implant backdoors in the client models. Although previous work has shown that clients can successfully detect such training-hijacking attacks, the proposed methods rely on heuristics, require tuning many hyperparameters, and do not fully utilize the clients' capabilities. In this work, we show that, given modest assumptions about the clients' compute capabilities, an out-of-the-box outlier detection method can detect existing training-hijacking attacks with almost-zero false positive rates. Through experiments on different tasks, we conclude that the simplicity of our approach, which we name SplitOut, makes it a more viable and reliable alternative to earlier detection methods.
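To make the detection idea concrete, the following is a minimal sketch, not the method evaluated in this work. It assumes the client can collect a set of reference gradients that are known to be honest (how they are obtained is left open here), and it uses a simple k-nearest-neighbor distance score as an illustrative stand-in for an out-of-the-box outlier detector. The names `knn_score`, `fit_threshold`, and `is_outlier`, the synthetic data, and all parameter values are hypothetical.

```python
import math
import random


def knn_score(point, reference, k=5):
    """Mean Euclidean distance from `point` to its k nearest reference vectors."""
    dists = sorted(math.dist(point, r) for r in reference)
    return sum(dists[:k]) / k


def fit_threshold(reference, k=5, quantile=0.99):
    """Score every reference vector against the others (leave-one-out)
    and return a high quantile of those scores as the decision threshold."""
    scores = []
    for i, p in enumerate(reference):
        others = reference[:i] + reference[i + 1:]
        scores.append(knn_score(p, others, k))
    scores.sort()
    return scores[min(len(scores) - 1, int(quantile * len(scores)))]


def is_outlier(gradient, reference, threshold, k=5):
    """Flag a gradient whose k-NN score exceeds the fitted threshold."""
    return knn_score(gradient, reference, k) > threshold


# Hypothetical stand-in data: honest gradients cluster around the origin.
rng = random.Random(0)
honest = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]
threshold = fit_threshold(honest)

normal_gradient = [0.0] * 8      # resembles the honest reference set
suspicious_gradient = [6.0] * 8  # far from anything seen during honest training

print(is_outlier(normal_gradient, honest, threshold))
print(is_outlier(suspicious_gradient, honest, threshold))
```

In this sketch the client would flag the server as potentially malicious once incoming gradients are scored as outliers; a real deployment would replace the toy scoring function with an established outlier detection implementation.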