This paper presents a Hierarchical Reinforcement Learning (HRL) methodology for optimizing CubeSat task scheduling in Low Earth Orbit (LEO). A high-level policy handles global task distribution, while a low-level policy performs real-time adaptation and serves as a safety mechanism. The approach integrates the Similarity Attention-based Encoder (SABE) for task prioritization and an MLP estimator for energy consumption forecasting. Together, these components form a safe and fault-tolerant system for CubeSat task scheduling. Simulation results show that the HRL approach achieves better convergence and a higher task success rate than both the MADDPG model and traditional random scheduling across multiple CubeSat configurations.
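As a rough illustration of the two-level structure described in the abstract, the sketch below pairs a toy high-level assignment step with a toy low-level energy check. It is not the paper's method: the learned policies are replaced by simple hand-written rules, the MLP energy estimator is replaced by a power-times-duration formula, and the task priorities are supplied by hand rather than produced by SABE. All names (`Task`, `estimate_energy_j`, `high_level_assign`, `low_level_filter`) and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: float    # hypothetical score; the paper derives priorities with SABE
    duration_s: float  # assumed task duration in seconds
    power_w: float     # assumed average power draw in watts


def estimate_energy_j(task: Task) -> float:
    """Stand-in for the MLP energy estimator: energy = power * duration."""
    return task.power_w * task.duration_s


def high_level_assign(tasks, n_sats):
    """Toy high-level step: deal tasks to satellites round-robin, highest priority first."""
    ordered = sorted(tasks, key=lambda t: t.priority, reverse=True)
    plan = [[] for _ in range(n_sats)]
    for i, task in enumerate(ordered):
        plan[i % n_sats].append(task)
    return plan


def low_level_filter(queue, battery_j):
    """Toy low-level safety step: skip any task whose forecast energy exceeds the remaining budget."""
    accepted, remaining = [], battery_j
    for task in queue:
        cost = estimate_energy_j(task)
        if cost <= remaining:
            accepted.append(task)
            remaining -= cost
    return accepted, remaining


# Hypothetical workload for two CubeSats, each with an assumed 400 J budget.
tasks = [
    Task("imaging", 0.9, 60.0, 5.0),
    Task("downlink", 0.7, 120.0, 4.0),
    Task("telemetry", 0.4, 30.0, 1.0),
    Task("calibration", 0.2, 200.0, 3.0),
]
plan = high_level_assign(tasks, n_sats=2)
schedule = [low_level_filter(queue, battery_j=400.0) for queue in plan]
```

In this toy run the second satellite is assigned only tasks that exceed its energy budget, so the low-level check rejects both. This is the kind of situation where, in the paper's design, the low-level policy is meant to adapt the global plan in real time.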