In the past decade, Deep Neural Networks (DNNs) have achieved state-of-the-art performance on a broad range of problems, from object classification and action recognition to smart buildings and healthcare. The flexibility that makes DNNs such a pervasive technology comes at a price: their computational requirements preclude deployment on most of today's resource-constrained edge devices for real-time, real-world tasks. This paper introduces a novel approach to this challenge that combines the concept of predefined sparsity with Split Computing (SC) and Early Exit (EE). SC splits a DNN so that one part runs on an edge device and the rest on a remote server; EE, in turn, allows the system to skip the remote server and rely solely on the edge device's computation when its answer is already good enough. How to apply predefined sparsity to an SC and EE paradigm has not been studied before. This paper addresses this problem and shows that predefined sparsity significantly reduces the computational, storage, and energy burdens of both the training and inference phases, regardless of the hardware platform, making it a valuable approach for enhancing the performance of SC and EE applications. Experimental results showcase reductions exceeding 4x in storage and computational complexity without compromising performance. The source code is available at https://github.com/intelligolabs/sparsity_sc_ee.
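The SC and EE interplay described above can be sketched minimally as follows. This is a hedged illustration, not the paper's actual architecture: `edge_head` and `server_tail` are hypothetical stand-ins for the on-device and remote sub-networks, and the confidence threshold is an assumed hyperparameter.

```python
# Minimal sketch of Split Computing (SC) with an Early Exit (EE).
# edge_head / server_tail are hypothetical sub-models, NOT the paper's.
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def edge_head(features):
    # Hypothetical lightweight classifier running on the edge device.
    return [2.0 * f for f in features]


def server_tail(features):
    # Hypothetical heavier classifier running on the remote server.
    return [3.0 * f + 0.1 for f in features]


def infer(features, threshold=0.9):
    """Return (predicted_class, exited_early).

    If the edge's prediction is confident enough (EE), the remote
    server is never contacted; otherwise the computation is completed
    server-side (SC).
    """
    probs = softmax(edge_head(features))
    conf = max(probs)
    if conf >= threshold:  # confident enough: stop at the edge
        return probs.index(conf), True
    probs = softmax(server_tail(features))  # otherwise offload to server
    return probs.index(max(probs)), False
```

In practice the early-exit threshold trades latency and energy against accuracy: a lower threshold keeps more inferences on the edge device at the cost of using the weaker edge-side classifier more often.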