Device-edge collaborative inference with Deep Neural Networks (DNNs) faces fundamental trade-offs among accuracy, latency, and energy consumption. Current scheduling approaches exhibit two drawbacks: a granularity mismatch between coarse, task-level decisions and fine-grained, packet-level channel dynamics, and insufficient awareness of per-task complexity. Consequently, scheduling solely at the task level leads to inefficient resource utilization. This paper proposes ENACHI, a novel ENergy-ACcuracy Hierarchical optimization framework for split Inference that jointly optimizes task- and packet-level scheduling to maximize accuracy under energy and delay constraints. A two-tier Lyapunov-based framework is developed for ENACHI, with a progressive transmission technique integrated to further enhance adaptivity. At the task level, an outer drift-plus-penalty loop makes online decisions on DNN partitioning and bandwidth allocation, and establishes a reference power budget that manages the long-term energy-accuracy trade-off. At the packet level, an uncertainty-aware progressive transmission mechanism adaptively manages per-sample task complexity; it is integrated with a nested inner control loop implementing a novel reference-tracking policy that dynamically adjusts per-slot transmit power to adapt to fluctuating channel conditions. Experiments on the ImageNet dataset demonstrate that ENACHI outperforms state-of-the-art benchmarks under varying deadlines and bandwidths, achieving a 43.12\% gain in inference accuracy with a 62.13\% reduction in energy consumption under stringent deadlines, and exhibits high scalability by maintaining stable energy consumption in congested multi-user scenarios.
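The outer drift-plus-penalty loop above can be illustrated with a minimal sketch: a virtual energy queue tracks accumulated deviation from the reference power budget, and each slot the controller picks the transmit power minimizing the standard Lyapunov objective (V-weighted negative utility plus queue-weighted energy cost). The rate model, the log-utility accuracy proxy, the Rayleigh-fading channel, and all parameter values here are illustrative assumptions, not the paper's actual design.

```python
import math
import random

def drift_plus_penalty_power(q_energy, channel_gain, powers, V, min_rate):
    """Choose the per-slot transmit power minimizing
    -V * utility(rate) + q_energy * power,
    over power levels whose rate meets an assumed deadline proxy (min_rate)."""
    best_p, best_obj = max(powers), float("inf")  # fall back to max power if no level is feasible
    for p in powers:
        rate = math.log2(1.0 + p * channel_gain)  # assumed Shannon-style rate model
        if rate < min_rate:                       # cannot meet the per-slot deadline proxy
            continue
        utility = math.log(1.0 + rate)            # assumed concave accuracy-utility proxy
        obj = -V * utility + q_energy * p         # drift-plus-penalty objective
        if obj < best_obj:
            best_p, best_obj = p, obj
    return best_p

def run(T=5000, p_budget=1.0, V=10.0, seed=0):
    """Simulate T slots; the virtual queue enforces the long-term power budget."""
    rng = random.Random(seed)
    powers = [0.2, 0.5, 1.0, 1.5, 2.0]           # discrete power levels (assumed)
    q, total_energy = 0.0, 0.0
    for _ in range(T):
        g = rng.expovariate(1.0)                  # Rayleigh-fading power gain (assumed)
        p = drift_plus_penalty_power(q, g, powers, V, min_rate=0.1)
        total_energy += p
        q = max(q + p - p_budget, 0.0)            # virtual energy queue update
    return total_energy / T

avg_power = run()
```

Because the virtual queue penalizes power use whenever past consumption exceeded the budget, the long-term average power stabilizes near `p_budget` while the controller opportunistically spends more power on slots with good channel gains; larger `V` trades tighter budget tracking for higher utility.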