Spacecraft Autonomous Decision-Planning for Collision Avoidance: A Reinforcement Learning Approach

The space environment around the Earth is becoming increasingly populated by both active spacecraft and space debris. Significant improvements in Space Situational Awareness (SSA) activities and Collision Avoidance (CA) technologies now allow spacecraft to be tracked and maneuvered with increasing accuracy and reliability, which helps to avoid potential collision events. However, these procedures still rely heavily on human intervention to make the necessary decisions. As the space environment grows more complex, this decision-making strategy is unlikely to be sustainable. It is therefore important to introduce higher levels of automation into key Space Traffic Management (STM) processes to ensure the reliability needed to navigate a large number of spacecraft. These processes range from collision risk detection, through identification of the appropriate action to take, to the execution of avoidance maneuvers. This work proposes an implementation of autonomous CA decision-making capabilities on spacecraft based on Reinforcement Learning (RL) techniques. A novel methodology based on a Partially Observable Markov Decision Process (POMDP) framework is developed to train the Artificial Intelligence (AI) system on board the spacecraft, considering both epistemic and aleatory uncertainties. The proposed framework accounts for imperfect monitoring information about the status of the debris in orbit and allows the AI system to learn stochastic policies for performing accurate Collision Avoidance Maneuvers (CAMs). The objective is to delegate the decision to implement a CAM to the spacecraft itself, without human intervention. This approach would allow for a faster response in the decision-making process and for highly decentralized operations.
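
To make the POMDP idea concrete, the following is a minimal illustrative sketch and not the method developed in this work. It assumes a hypothetical toy model in which the hidden state is a scalar miss distance at closest approach, the spacecraft receives noisy measurements of it, maintains a Gaussian belief via a scalar Bayesian update, and samples a maneuver decision from a hand-written logistic policy rather than a learned one. All names, the prior, the threshold, and the noise level are assumptions chosen for illustration.

```python
import math
import random

# Hypothetical toy model (not from the paper): the hidden state is the miss
# distance in km at closest approach; only noisy measurements are available.


def belief_update(mean, var, obs, obs_var):
    """Scalar Bayesian (Kalman) update of a Gaussian belief over miss distance."""
    gain = var / (var + obs_var)
    new_mean = mean + gain * (obs - mean)
    new_var = (1.0 - gain) * var
    return new_mean, new_var


def maneuver_probability(mean, var, threshold_km=1.0, temperature=0.5):
    """Stochastic policy: probability of commanding a CAM given the belief.

    Uses a logistic function of a conservative (mean minus one sigma)
    miss-distance estimate. The threshold and temperature are assumed values.
    """
    conservative = mean - math.sqrt(var)
    z = (conservative - threshold_km) / temperature
    z = max(-50.0, min(50.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(z))


def run_episode(true_miss_km, steps=5, obs_std=0.5, seed=0):
    """Collect noisy observations, update the belief, then sample an action."""
    rng = random.Random(seed)
    mean, var = 5.0, 4.0  # assumed prior belief (km, km^2)
    for _ in range(steps):
        obs = true_miss_km + rng.gauss(0.0, obs_std)  # measurement noise
        mean, var = belief_update(mean, var, obs, obs_std ** 2)
    p = maneuver_probability(mean, var)
    action = "maneuver" if rng.random() < p else "no_maneuver"
    return mean, var, p, action


if __name__ == "__main__":
    print(run_episode(true_miss_km=0.3))
    print(run_episode(true_miss_km=8.0))
```

In this sketch the measurement noise plays the role of aleatory uncertainty, while the belief variance, which shrinks as observations accumulate, stands in for epistemic uncertainty. In an RL setting the fixed logistic rule would be replaced by a policy whose parameters are learned from simulated conjunction episodes.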
