Edge intelligence enables resource-demanding Deep Neural Network (DNN) inference without transferring the original data, addressing data privacy concerns in consumer Internet of Things (IoT) devices. For privacy-sensitive applications, deploying models in hardware-isolated trusted execution environments (TEEs) is essential. However, the limited secure memory in TEEs makes deploying DNN inference challenging, and alternative techniques such as model partitioning and offloading degrade performance and introduce security issues. In this paper, we present a novel approach for advanced model deployment in TrustZone that ensures comprehensive privacy preservation during model inference. We design a memory-efficient management method to support memory-demanding inference in TEEs. By adjusting the memory priority, we mitigate memory leakage risks and memory overlap conflicts while changing only 32 lines of code in the trusted operating system. Additionally, we leverage two tiny libraries to support efficient inference in TEEs: S-Tinylib (2,538 LoC), a tiny deep learning library, and Tinylibm (827 LoC), a tiny math library. We implemented a prototype on a Raspberry Pi 3B+ and evaluated it using three well-known lightweight DNN models. The experimental results demonstrate that our design improves inference speed by 3.13 times and reduces power consumption by over 66.5% compared with a method without memory optimization in TEEs.