In split inference, a deep neural network (DNN) is partitioned so that its early part runs at the edge and its later part runs in the cloud. This meets two key requirements of on-device machine learning: input privacy and computation efficiency. An open question, however, is output privacy, since the outputs of the DNN are observable in the cloud. Encrypted computing can protect output privacy as well, but homomorphic encryption demands substantial computation and communication resources from both edge and cloud devices. In this paper, we introduce Salted DNNs: a novel approach that enables clients at the edge, who run the early part of the DNN, to control the semantic interpretation of the DNN's outputs at inference time. Salted DNNs maintain classification accuracy and computation efficiency very close to those of their standard DNN counterparts. Experimental evaluations on both image and wearable-sensor data confirm this for accuracy, particularly when the Salted Layer is positioned within the early part of the network to meet the requirements of split inference. Our approach is general and can be applied to various types of DNNs. As a benchmark for future studies, we open-source our code.
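To make the idea of client-controlled output semantics concrete, the following is a minimal sketch of the label bookkeeping only, not the Salted DNN itself. The abstract does not specify how the salt changes the output layout, so the cyclic shift used here, the class count `NUM_CLASSES`, and the helper names `pick_salt`, `salted_index`, and `decode` are all illustrative assumptions. In an actual Salted DNN the network would be trained to emit outputs in the salt-dependent order; here that behavior is simply simulated.

```python
import secrets

NUM_CLASSES = 10  # assumption: a 10-class classifier, chosen only for illustration


def pick_salt(num_classes: int = NUM_CLASSES) -> int:
    """Client side: draw a secret salt uniformly from [0, num_classes)."""
    return secrets.randbelow(num_classes)


def salted_index(true_class: int, salt: int, num_classes: int = NUM_CLASSES) -> int:
    """Output position that encodes `true_class` under `salt`.

    Assumption: the salt acts as a cyclic shift of the class order.
    This stands in for what a trained salted model would produce.
    """
    return (true_class + salt) % num_classes


def decode(observed_index: int, salt: int, num_classes: int = NUM_CLASSES) -> int:
    """Client side: map the index seen in the cloud output back to the class."""
    return (observed_index - salt) % num_classes


if __name__ == "__main__":
    salt = pick_salt()                     # known only to the client
    observed = salted_index(3, salt)       # what the cloud would observe
    print("cloud sees index:", observed)
    print("client decodes class:", decode(observed, salt))
```

Under these assumptions, the cloud observes only the shifted index. Without the salt, it cannot tell which class that index refers to, while the client recovers the class with a single modular subtraction.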