Combinatorial Client-Master Multiagent Deep Reinforcement Learning for Task Offloading in Mobile Edge Computing

Recently, there has been an explosion of mobile applications that perform computationally intensive tasks such as video streaming, data mining, virtual reality, augmented reality, image processing, video processing, face recognition, and online gaming. However, user devices (UDs), such as tablets and smartphones, have limited capacity to meet the computational demands of these tasks. Mobile edge computing (MEC) has emerged as a promising technology to meet the increasing computing demands of UDs. Task offloading in MEC is a strategy that meets the demands of UDs by distributing tasks between UDs and MEC servers. Deep reinforcement learning (DRL) is gaining attention in task-offloading problems because it can adapt to dynamic changes and minimize online computational complexity. However, the various types of continuous and discrete resource constraints on UDs and MEC servers pose challenges to the design of an efficient DRL-based task-offloading strategy. Existing DRL-based task-offloading algorithms focus on the constraints of the UDs and assume that the server has sufficient storage resources. Moreover, existing multiagent DRL (MADRL)-based task-offloading algorithms use homogeneous agents and treat homogeneous constraints as a penalty in their reward function. We propose a novel combinatorial client-master MADRL (CCM_MADRL) algorithm for task offloading in MEC (CCM_MADRL_MEC) that enables UDs to decide their resource requirements and the server to make a combinatorial decision based on the requirements of the UDs. CCM_MADRL_MEC is the first MADRL algorithm for task offloading to consider server storage capacity in addition to the constraints on the UDs. By taking advantage of combinatorial action selection, CCM_MADRL_MEC shows superior convergence over existing MADDPG and heuristic algorithms.
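
The client-master split described above can be pictured with a minimal sketch. It is not the CCM_MADRL_MEC algorithm itself: the `Proposal` fields, the `score` value (a hypothetical stand-in for a learned value estimate), and the greedy selection rule are all assumptions made for illustration. Each UD submits its own offloading decision and storage requirement, and the server then picks a combination of requests that fits within its storage capacity.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    """One UD's request to the server (all fields are hypothetical)."""
    ud_id: int
    wants_offload: bool   # discrete client decision: offload or run locally
    storage_req: float    # server storage the task would occupy
    score: float          # stand-in for a learned value estimate


def master_select(proposals, storage_capacity):
    """Greedily admit offloading requests by score while storage remains.

    This greedy rule is an assumed placeholder for the learned
    combinatorial decision made by the server.
    """
    admitted = []
    remaining = storage_capacity
    candidates = sorted(
        (p for p in proposals if p.wants_offload),
        key=lambda p: p.score,
        reverse=True,
    )
    for p in candidates:
        if p.storage_req <= remaining:
            admitted.append(p.ud_id)
            remaining -= p.storage_req
    return admitted


proposals = [
    Proposal(0, True, 4.0, 0.9),
    Proposal(1, True, 5.0, 0.7),
    Proposal(2, False, 0.0, 0.0),  # this UD chooses local execution
    Proposal(3, True, 3.0, 0.6),
]
selected = master_select(proposals, storage_capacity=8.0)
print(selected)  # [0, 3]
```

In this example UD 1 is rejected because its request no longer fits after UD 0 is admitted, so the server's choice depends on the whole set of requests rather than on each UD in isolation.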
