Deep Neural Networks (DNNs) have significantly improved the accuracy of intelligent applications on mobile devices. DNN surgery, which partitions DNN processing between mobile devices and multi-access edge computing (MEC) servers, can enable real-time inference despite the computational limitations of mobile devices. However, DNN surgery faces a critical challenge: determining the optimal computing resource demand from the server and the corresponding partition strategy, while considering both inference latency and MEC server usage costs. This problem is compounded by two factors: (1) the finite computing capacity of the MEC server, which is shared among multiple devices, leading to inter-dependent demands, and (2) the shift in modern DNN architectures from chains to directed acyclic graphs (DAGs), which complicates potential solutions. In this paper, we introduce a novel Decentralized DNN Surgery (DDS) framework. We formulate the partition strategy as a min-cut problem and propose a resource allocation game to adaptively schedule the demands of mobile devices in an MEC environment. We prove the existence of a Nash Equilibrium (NE) and develop an iterative algorithm that efficiently reaches the NE for each device. Our extensive experiments demonstrate that DDS can effectively handle varying MEC scenarios, achieving up to 1.25$\times$ acceleration compared to the state-of-the-art algorithm.
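The abstract does not spell out how the partition is cast as a min-cut, so the following is only a minimal sketch of one common construction, not the DDS formulation itself. It assumes per-layer device and server execution times and per-edge upload times are known. The function name `min_cut_partition`, the virtual `input` layer, and all numbers are hypothetical. As a simplification, upload cost is charged per DAG edge, so a layer whose output feeds several server-side successors is charged once per successor.

```python
from collections import defaultdict, deque

INF = float("inf")

def min_cut_partition(layers, edges, device_time, server_time, tx_time):
    """Split a DAG of layers between device and server via an s-t min-cut.

    Returns (total_latency, layers_kept_on_device). Hypothetical helper.
    """
    SRC, SNK = "_device", "_server"
    cap = defaultdict(lambda: defaultdict(float))
    for v in layers:
        cap[SRC][v] += server_time[v]  # cut if v runs on the server
        cap[v][SNK] += device_time[v]  # cut if v runs on the device
    for u, v in edges:
        cap[u][v] += tx_time[(u, v)]   # cut if u's output is uploaded to v
        cap[v][u] = INF                # forbid server -> device data flow

    def reachable():
        # BFS over residual edges with remaining capacity.
        parent = {SRC: None}
        queue = deque([SRC])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    total = 0.0
    while True:
        parent = reachable()
        if SNK not in parent:
            break  # no augmenting path left: the flow is maximal
        # Find the bottleneck along the shortest augmenting path.
        flow, v = INF, SNK
        while parent[v] is not None:
            flow = min(flow, cap[parent[v]][v])
            v = parent[v]
        # Push the flow and update residual capacities.
        v = SNK
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= flow
            cap[v][u] += flow
            v = u
        total += flow

    # Layers still reachable from the source stay on the device.
    return total, [v for v in layers if v in parent]

# Toy DAG with a branch: input -> a -> {b, c} -> d (times in ms, made up).
layers = ["input", "a", "b", "c", "d"]
edges = [("input", "a"), ("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
device_time = {"input": 0, "a": 4, "b": 10, "c": 10, "d": 6}
server_time = {"input": INF, "a": 1, "b": 2, "c": 2, "d": 1}  # input is pinned to the device
tx_time = {("input", "a"): 8, ("a", "b"): 2, ("a", "c"): 2,
           ("b", "d"): 5, ("c", "d"): 5}

latency, on_device = min_cut_partition(layers, edges, device_time,
                                       server_time, tx_time)
print(latency, on_device)  # 13.0 ['input', 'a']
```

In this toy instance, running `a` locally and uploading its two outputs (4 + 2 + 2 + 5 = 13 ms) beats both full offloading (14 ms) and fully local execution (30 ms). In a multi-device setting, `server_time` would depend on the share of server capacity each device obtains, which is the coupling the resource allocation game is meant to resolve.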