With the development of the Internet of Things (IoT), some IoT devices can not only complete their own tasks but also assist other resource-constrained devices at the same time. This paper therefore considers a device-assisted mobile edge computing system that uses auxiliary IoT devices to relieve the computational burden on the edge computing server and improve overall system performance. Computationally intensive tasks are decomposed into multiple partitions, and each partition can be processed in parallel on an IoT device or on the edge server. The objective is to develop an efficient online algorithm for the joint optimization of task partitioning and parallel scheduling under time-varying system states, a problem that is challenging for conventional numerical optimization methods. To address it, a framework based on deep reinforcement learning (DRL), called online task partitioning action and parallel scheduling policy generation (OTPPS), is proposed. Specifically, the framework uses a deep neural network (DNN) to learn a mapping from input system states to the optimal partitioning action for each task. Furthermore, the remaining parallel scheduling problem is shown to be NP-hard for a given task partitioning action. To solve this subproblem, a fair and delay-minimized task scheduling (FDMTS) algorithm is designed. Extensive evaluation results show that, compared with baseline schemes, OTPPS achieves near-optimal average delay and consistently high fairness across a range of environmental states.
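To make the scheduling subproblem concrete, the following is a minimal sketch of assigning the partitions of one task to parallel processors once a partitioning action is fixed. It is not the FDMTS algorithm, whose details the abstract does not give. It swaps in a simple greedy heuristic (largest partition first, placed on the processor that would finish it earliest) and uses Jain's index as one possible fairness measure. The function names, the cycle and speed units, and the choice of Jain's index are all assumptions made for illustration.

```python
from typing import List, Tuple


def greedy_schedule(partition_cycles: List[float],
                    speeds: List[float]) -> Tuple[List[int], float]:
    """Greedy heuristic (an illustrative stand-in, NOT the paper's FDMTS).

    partition_cycles: CPU cycles needed by each partition (hypothetical units).
    speeds: cycles per second of each processor, i.e. the auxiliary IoT
            devices and the edge server (hypothetical values).
    Returns (assignment, delay): assignment[i] is the processor index chosen
    for partition i, and delay is when the last processor finishes.
    """
    finish = [0.0] * len(speeds)              # current finish time per processor
    assignment = [-1] * len(partition_cycles)
    # Consider the largest partitions first.
    order = sorted(range(len(partition_cycles)),
                   key=lambda i: partition_cycles[i], reverse=True)
    for i in order:
        # Pick the processor on which this partition would complete earliest.
        best = min(range(len(speeds)),
                   key=lambda p: finish[p] + partition_cycles[i] / speeds[p])
        finish[best] += partition_cycles[i] / speeds[best]
        assignment[i] = best
    return assignment, max(finish)


def jain_index(values: List[float]) -> float:
    """Jain's fairness index: 1.0 when all values are equal, 1/n at worst.

    Assumption: the abstract does not say how fairness is measured.
    """
    squares = sum(v * v for v in values)
    if squares == 0:
        return 1.0  # all-zero loads are treated as perfectly fair
    total = sum(values)
    return total * total / (len(values) * squares)
```

For example, `greedy_schedule([4.0, 2.0, 2.0], [2.0, 1.0])` places the first and third partitions on the faster processor and the second on the slower one, giving a delay of 3.0 seconds. Applying `jain_index` to the per-processor busy times then summarizes how evenly the load is spread.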