Edge intelligence (EI) allows resource-constrained edge devices (EDs) to offload computation-intensive AI tasks (e.g., visual object detection) to edge servers (ESs) for fast execution. However, transmitting high-volume raw task data (e.g., 4K video) over bandwidth-limited wireless networks incurs significant latency. While EDs can reduce transmission latency by degrading data before transmission (e.g., reducing resolution from 4K to 720p or 480p), such degradation often deteriorates inference accuracy, creating a critical accuracy-latency tradeoff. Balancing this tradeoff is difficult for two reasons: no closed-form model captures the content-dependent accuracy-latency relationship, and under bandwidth-sharing constraints the discrete degradation decisions across EDs exhibit inherent combinatorial complexity. Mathematically, the problem is a challenging \textit{black-box} mixed-integer nonlinear program (MINLP). To address this problem, we propose LAB, a novel learning framework that seamlessly integrates deep reinforcement learning (DRL) and Bayesian optimization (BO). Specifically, LAB employs (a) a DNN-based actor that maps the input system state to degradation actions, directly addressing the combinatorial complexity of the MINLP, and (b) a BO-based critic that fits a Gaussian process surrogate to historical observations, enabling model-based evaluation of degradation actions. For each selected action, the optimal bandwidth allocation is then derived efficiently via convex optimization. Numerical evaluations on real-world self-driving datasets demonstrate that LAB achieves a near-optimal accuracy-latency tradeoff, with only 1.22\% accuracy degradation and 0.07 s added latency compared to exhaustive search. The complete source code for LAB is available at https://github.com/Ethan-Xian-Li/LAB.
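The BO-based critic described above can be sketched as a Gaussian process surrogate fit to historical (degradation action, observed reward) pairs, whose posterior mean scores candidate actions proposed by the actor. The following minimal sketch is illustrative only: the RBF kernel, the two-ED action encoding (resolution indices), and all reward values are assumptions, not the paper's actual design.

```python
import numpy as np

def rbf_kernel(X1, X2, length_scale=1.0, variance=1.0):
    # Squared-exponential kernel between two sets of action vectors.
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / length_scale ** 2)

def gp_posterior(X_hist, y_hist, X_cand, noise=1e-4):
    # GP posterior mean/variance at candidate actions given history.
    K = rbf_kernel(X_hist, X_hist) + noise * np.eye(len(X_hist))
    Ks = rbf_kernel(X_hist, X_cand)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_hist))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = rbf_kernel(X_cand, X_cand).diagonal() - (v ** 2).sum(0)
    return mean, np.maximum(var, 0.0)

# Hypothetical history: per-ED resolution indices {0: 480p, 1: 720p, 2: 4K}
# for two EDs, with made-up reward values (higher = better tradeoff).
X_hist = np.array([[0, 0], [1, 2], [2, 1], [2, 2]], dtype=float)
y_hist = np.array([0.55, 0.80, 0.78, 0.70])

# Candidate degradation actions (e.g., proposed by the DNN actor),
# scored by the surrogate's posterior mean.
X_cand = np.array([[1, 1], [2, 0]], dtype=float)
mean, var = gp_posterior(X_hist, y_hist, X_cand)
best_action = X_cand[np.argmax(mean)]
```

Because the surrogate is an explicit model, each candidate action is evaluated analytically rather than by running the actual inference pipeline, which is what makes the critic's evaluation cheap.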
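For the per-action bandwidth allocation step, one simple instance of the convex subproblem is minimizing the maximum transmission latency across EDs under a total-bandwidth constraint. Assuming a linear rate model (latency $t_i = d_i / (b_i r_i)$ for data size $d_i$, bandwidth share $b_i$, and per-Hz rate $r_i$), the min-max optimum equalizes all latencies and admits a closed form; this sketch and its numbers are illustrative assumptions, not the paper's exact formulation.

```python
def allocate_bandwidth(data_sizes, rates_per_hz, total_bw):
    """Min-max latency bandwidth split for a fixed degradation action.

    With t_i = d_i / (b_i * r_i) and sum(b_i) = B, the min-max optimum
    equalizes latencies: b_i = B * (d_i / r_i) / sum_j (d_j / r_j).
    """
    loads = [d / r for d, r in zip(data_sizes, rates_per_hz)]
    total = sum(loads)
    return [total_bw * w / total for w in loads]

# Two EDs after degradation: hypothetical frame sizes in bits and
# per-Hz spectral efficiencies; 10 MHz of shared bandwidth.
bw = allocate_bandwidth([2.0e6, 0.8e6], [4.0, 2.0], total_bw=10e6)
# At this allocation every ED finishes transmitting at the same time.
```

The equal-latency structure follows from the KKT conditions of the min-max problem: if one ED finished earlier, shifting bandwidth from it to the slowest ED would reduce the maximum latency.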