Terahertz communication networks and intelligent reflecting surfaces show significant potential for advancing wireless networks, particularly in aerial multi-access edge computing systems. These technologies enable user devices to efficiently offload computational tasks to Unmanned Aerial Vehicles (UAVs) or to execute them locally. Generating high-quality task-offloading allocations requires solving challenging combinatorial optimization problems, and conventional numerical optimization methods often cannot solve them within the limited channel coherence time, so they fail to respond quickly to dynamic changes in system conditions. To address this challenge, we propose a deep learning-based optimization framework called Iterative Order-Preserving policy Optimization (IOPO), which generates energy-efficient task-offloading decisions within milliseconds. Rather than resorting to exhaustive search, IOPO continuously updates the offloading decisions, which accelerates convergence and reduces computational complexity, particularly for complex problems with large solution spaces. Experimental results demonstrate that the proposed framework generates energy-efficient task-offloading decisions within a very short time and outperforms the benchmark methods.