Device-edge collaboration on deep neural network (DNN) inference is a promising approach to efficiently utilizing network resources for supporting artificial intelligence of things (AIoT) applications. In this paper, we propose a novel digital twin (DT)-assisted approach to device-edge collaboration on DNN inference that determines whether and when to stop local inference at a device and upload the intermediate results to complete the inference on an edge server. Instead of determining the collaboration strategy for each DNN inference task only upon its generation, multi-step decision-making is performed during the on-device inference to adapt to the dynamic computing workload at the device and the edge server. To enhance adaptivity, a DT is constructed to evaluate all potential offloading decisions for each DNN inference task, which provides augmented training data for a machine learning-assisted decision-making algorithm. Then, another DT is constructed to estimate the inference status at the device, avoiding frequent fetching of status information from the device and thus reducing signaling overhead. We also derive necessary conditions for optimal offloading decisions to reduce the offloading decision space. Simulation results demonstrate the outstanding performance of our DT-assisted approach in balancing the tradeoff among inference accuracy, delay, and energy consumption.