Split learning (SL) is a new collaborative learning technique that allows participants, e.g. a client and a server, to train machine learning models without the client sharing raw data. In this setting, the client first applies its part of the machine learning model to the raw data to generate activation maps, and then sends them to the server to continue the training process. Previous works have demonstrated that the raw client data can be reconstructed from these activation maps, resulting in privacy leakage. Moreover, existing techniques that mitigate this leakage come at a significant cost in accuracy. In this paper, we improve upon previous works by constructing a protocol based on U-shaped SL that can operate on homomorphically encrypted data. More precisely, in our approach the client applies homomorphic encryption (HE) to the activation maps before sending them to the server, thus protecting user privacy. This is an important improvement that reduces privacy leakage in comparison to other SL-based works. Finally, our results show that, with the optimum set of parameters, training on HE data in the U-shaped SL setting reduces accuracy by only 2.65% compared to training on plaintext, while the privacy of the raw training data is preserved.
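The data flow described above can be illustrated with a minimal forward-pass sketch. Everything in it is an assumption made for illustration and not the paper's implementation: it substitutes a toy Paillier scheme (additively homomorphic, with key material far too simple to be secure) for whatever HE scheme the protocol uses, quantizes activations to integers with a hypothetical `scale` factor, and uses made-up layer shapes and weights. The function names (`client_front`, `server_linear`, `client_tail`) are hypothetical. Following the usual U-shaped SL layout, the client keeps both the first and the last part of the model, so the server only ever handles ciphertexts. The backward pass is omitted.

```python
import random
from math import gcd

# Toy Paillier parameters built from two Mersenne primes.
# Illustration only: this key is NOT secure.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
PHI = (P - 1) * (Q - 1)
MU = pow(PHI, -1, N)  # modular inverse (Python 3.8+)


def encrypt(m: int) -> int:
    """Client side: encrypt an integer (negatives are encoded mod N)."""
    r = random.randrange(1, N)
    while gcd(r, N) != 1:
        r = random.randrange(1, N)
    return (1 + (m % N) * N) * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    """Client side: decrypt and map the result back to a signed integer."""
    m = (pow(c, PHI, N2) - 1) // N * MU % N
    return m - N if m > N // 2 else m


def client_front(x, w_front, scale=100):
    """Client side: first model part (ReLU layer), then integer quantization."""
    acts = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_front]
    return [round(a * scale) for a in acts]


def server_linear(enc_acts, weights, biases):
    """Server side: integer linear layer evaluated on ciphertexts only.

    Multiplying ciphertexts adds the plaintexts; raising a ciphertext to a
    plaintext power multiplies its plaintext by that power.
    """
    out = []
    for w_row, b in zip(weights, biases):
        acc = (1 + (b % N) * N) % N2  # encoding of the plaintext bias
        for c, w in zip(enc_acts, w_row):
            acc = acc * pow(c, w % N, N2) % N2
        out.append(acc)
    return out


def client_tail(logits):
    """Client side: last model part; here simply the predicted class index."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Hypothetical inputs and weights.
x = [0.5, -1.2, 3.0]
w_front = [[0.2, -0.1, 0.4], [0.7, 0.3, -0.5]]
w_server = [[3, -2], [-1, 4]]
b_server = [5, -7]

acts = client_front(x, w_front)                         # stays on the client
enc_acts = [encrypt(a) for a in acts]                   # sent to the server
enc_out = server_linear(enc_acts, w_server, b_server)   # server sees no plaintext
logits = [decrypt(c) for c in enc_out]                  # back on the client
prediction = client_tail(logits)

# Plaintext reference for comparison.
expected = [sum(w * a for w, a in zip(row, acts)) + b
            for row, b in zip(w_server, b_server)]
```

In this sketch `logits` equals `expected` exactly because Paillier arithmetic on integers is exact. Approximate schemes that operate on real-valued vectors introduce a small numerical error instead, which is one plausible source of an accuracy gap between encrypted and plaintext training.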