Split learning (SL) is a new collaborative learning technique that allows participants, e.g. a client and a server, to train machine learning models without the client sharing raw data. In this setting, the client first applies its part of the model to the raw data to generate activation maps (AMs) and then sends them to the server, which continues the training process. Previous work has demonstrated that reconstructing AMs can leak private client data. Moreover, existing techniques that mitigate this leakage come at a significant cost in accuracy. In this paper, we improve upon previous work by constructing a protocol based on U-shaped SL that can operate on homomorphically encrypted data. More precisely, in our approach the client applies homomorphic encryption (HE) to the AMs before sending them to the server, thus protecting user privacy. This reduces privacy leakage compared with other SL-based works. Finally, our results show that, with the optimum set of parameters, training on HE data in the U-shaped SL setting reduces accuracy by only 2.65% compared to training on plaintext, while preserving the privacy of the raw training data.
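The data flow described in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the abstract does not name the HE scheme or the model, so the sketch substitutes a toy Paillier cryptosystem (additively homomorphic, with a deliberately insecure key size) and a two-layer model with made-up weights. It also assumes the usual U-shaped arrangement in which the client holds both the first and the last layer, and it covers only a forward pass, not training. All function names (`client_front`, `server_linear`, `client_tail`, `u_shaped_forward`) and the fixed-point `SCALE` are hypothetical.

```python
import math
import random

# Toy Paillier keypair from two known Mersenne primes (assumption: demo only,
# far too small to be secure; a real deployment would use a vetted HE library).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q                      # public key
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # private: lcm(P-1, Q-1)
MU = pow(LAM, -1, N)           # private: valid because the generator is N + 1
SCALE = 1000                   # fixed-point scale for activations and weights


def encrypt(m):
    """Client side: encrypt a signed integer under the public key N."""
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return (1 + (m % N) * N) * pow(r, N, N2) % N2


def decrypt(c):
    """Client side: decrypt and map the result back to a signed integer."""
    m = (pow(c, LAM, N2) - 1) // N * MU % N
    return m - N if m > N // 2 else m


def client_front(x, w_front):
    """Client's first layer: linear map plus ReLU, producing the activation maps."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_front]


def server_linear(enc_acts, w_int):
    """Server side: integer linear layer evaluated on ciphertexts only.

    Multiplying ciphertexts adds the plaintexts; raising a ciphertext to a
    plaintext power multiplies its plaintext by that power.
    """
    outputs = []
    for row in w_int:
        acc = 1  # trivial encryption of 0
        for c, w in zip(enc_acts, row):
            acc = acc * pow(c, w % N, N2) % N2
        outputs.append(acc)
    return outputs


def client_tail(logits):
    """Client's last layer: predicted class index, so labels never leave the client."""
    return max(range(len(logits)), key=logits.__getitem__)


def u_shaped_forward(x, w_front, w_server):
    """One forward pass: client -> (encrypted AMs) -> server -> client."""
    acts = client_front(x, w_front)
    enc_acts = [encrypt(round(a * SCALE)) for a in acts]   # sent to the server
    w_int = [[round(w * SCALE) for w in row] for row in w_server]
    enc_out = server_linear(enc_acts, w_int)               # returned to the client
    logits = [decrypt(c) / SCALE**2 for c in enc_out]      # undo both scalings
    return logits, client_tail(logits)
```

For example, `u_shaped_forward([0.5, -1.0, 2.0], [[0.2, -0.4, 0.1], [-0.3, 0.1, 0.5]], [[1.5, -2.0], [-0.5, 0.25]])` yields logits close to `[-0.45, -0.1625]`, matching the plaintext computation up to fixed-point rounding, while the server only ever handles ciphertexts. Paillier supports only additions and plaintext multiplications, which is why the server part here is restricted to a linear layer.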