This paper investigates a wireless powered mobile edge computing (WP-MEC) network with multiple hybrid access points (HAPs) in a dynamic environment, where wireless devices (WDs) harvest energy from the radio frequency (RF) signals of the HAPs and then either process their computation data locally (i.e., local computing mode) or offload it to a chosen HAP (i.e., edge computing mode). To pursue a green computing design, we formulate an optimization problem that minimizes the long-term energy provision of the WP-MEC network subject to energy, computing delay, and computation data demand constraints. The transmit power of the HAPs, the duration of the wireless power transfer (WPT) phase, the offloading decisions of the WDs, the time allocation for offloading, and the CPU frequency for local computing are jointly optimized in adaptation to the time-varying computation data generated at the WDs and their time-varying wireless channels. To efficiently solve the formulated non-convex mixed-integer programming (MIP) problem in a distributed manner, we propose a Two-stage Multi-Agent deep reinforcement learning-based Distributed computation Offloading (TMADO) framework, which consists of a high-level agent and multiple low-level agents. The high-level agent, residing in all HAPs, optimizes the transmit power of the HAPs and the duration of the WPT phase, while each low-level agent, residing in a WD, optimizes that WD's offloading decision, time allocation for offloading, and CPU frequency for local computing. Simulation results demonstrate the superiority of the proposed TMADO framework in minimizing energy provision.
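To make the split between the high-level and low-level decision variables concrete, the following is a minimal sketch of a per-slot feasibility check for a single WD. It is not the paper's system model: the linear energy-harvesting model, the quadratic-in-frequency local computing energy, the fixed WD transmit power, the assumption that computing or offloading starts only after the WPT phase, and all names and default values (`SlotDecision`, `efficiency`, `kappa`, `cycles_per_bit`, `tx_power_w`) are illustrative assumptions. The data demand constraint for offloading, which would require a rate model, is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class SlotDecision:
    """Decision variables for one WD in one time slot (hypothetical names)."""
    hap_power_w: float     # HAP transmit power in the WPT phase (high-level agent)
    wpt_duration_s: float  # duration of the WPT phase (high-level agent)
    offload: bool          # True = edge computing mode (low-level agent)
    offload_time_s: float  # time allocated for offloading (low-level agent)
    cpu_freq_hz: float     # CPU frequency for local computing (low-level agent)


def energy_provision(d: SlotDecision) -> float:
    """Energy the HAP spends on WPT in this slot: the quantity to minimize."""
    return d.hap_power_w * d.wpt_duration_s


def harvested_energy(d: SlotDecision, channel_gain: float,
                     efficiency: float = 0.5) -> float:
    """Assumed linear harvesting model: efficiency * power * gain * time."""
    return efficiency * d.hap_power_w * channel_gain * d.wpt_duration_s


def consumed_energy(d: SlotDecision, data_bits: float,
                    cycles_per_bit: float = 1000.0, kappa: float = 1e-26,
                    tx_power_w: float = 0.1) -> float:
    """Energy the WD spends in the chosen mode (assumed models)."""
    if d.offload:
        # Assumed fixed WD transmit power over the offloading time.
        return tx_power_w * d.offload_time_s
    # Assumed CMOS-style model: kappa * f^2 joules per CPU cycle.
    return kappa * d.cpu_freq_hz ** 2 * cycles_per_bit * data_bits


def delay(d: SlotDecision, data_bits: float,
          cycles_per_bit: float = 1000.0) -> float:
    """Time until the task finishes, assuming work starts after WPT ends."""
    if d.offload:
        return d.wpt_duration_s + d.offload_time_s
    return d.wpt_duration_s + cycles_per_bit * data_bits / d.cpu_freq_hz


def is_feasible(d: SlotDecision, channel_gain: float, data_bits: float,
                slot_length_s: float = 1.0) -> bool:
    """Check the energy and computing delay constraints for one WD."""
    energy_ok = consumed_energy(d, data_bits) <= harvested_energy(d, channel_gain)
    delay_ok = delay(d, data_bits) <= slot_length_s
    return energy_ok and delay_ok


if __name__ == "__main__":
    local = SlotDecision(hap_power_w=3.0, wpt_duration_s=0.2, offload=False,
                         offload_time_s=0.0, cpu_freq_hz=5e7)
    print("energy provision (J):", energy_provision(local))
    print("feasible:", is_feasible(local, channel_gain=1e-3, data_bits=1e4))
```

In this sketch, lowering `hap_power_w` or `wpt_duration_s` reduces the energy provision but shrinks the harvested energy budget, so the low-level choices (mode, offloading time, CPU frequency) have to adapt to stay feasible. That coupling is what motivates optimizing the two groups of variables jointly.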