With the growing popularity of cloud computing and machine learning, outsourcing machine learning processes (including model training and model-based inference) to the cloud has become a trend. Beyond giving users access to the extensive and scalable resources offered by the cloud service provider, outsourcing is even more attractive if the cloud servers can manage the machine learning processes autonomously on the users' behalf. Such a feature is especially valuable when machine learning is expected to be a long-term, continuous process and the users are not always available to participate. Because of security and privacy concerns, autonomous learning should also preserve the confidentiality of the users' data and of the models involved. In this paper, we therefore aim to design a scheme that enables autonomous and confidential model refining in the cloud. Homomorphic encryption and trusted execution environment technology can each protect confidentiality during autonomous computation, but each has its own limitations, and the two are complementary. We therefore propose to integrate the two techniques in the design of the model refining scheme. Through implementation and experiments, we evaluate the feasibility of the proposed scheme. The results indicate that, with the proposed scheme, the cloud server can autonomously refine an encrypted model with newly provided encrypted training data to continuously improve its accuracy. Although the efficiency is still significantly lower than that of the baseline scheme, which refines a plaintext model with plaintext data, we expect that it can be improved by fully exploiting a higher level of parallelism and the computational power of GPUs at the cloud server.