As space becomes increasingly crowded and contested, robust autonomous capabilities for multi-agent environments are gaining critical importance. Current autonomous systems in space rely primarily on optimization-based path planning or long-range orbital maneuvers, which have not yet proven effective in adversarial scenarios where one satellite actively pursues another. We introduce Divergent Adversarial Reinforcement Learning (DARL), a two-stage Multi-Agent Reinforcement Learning (MARL) approach for training autonomous evasion strategies for a satellite pursued by multiple adversarial spacecraft. Our method enhances exploration during training by promoting diverse adversarial strategies, yielding more robust and adaptable evader models. We validate DARL in a cat-and-mouse satellite scenario, modeled as a partially observable multi-agent capture-the-flag game in which two adversarial ``cat'' spacecraft pursue a single ``mouse'' evader. We compare DARL's performance against several benchmarks, including an optimization-based satellite path planner, demonstrating its ability to produce highly robust models for adversarial multi-agent space environments.
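To make the diversity-promotion idea concrete, here is a minimal sketch of one common way to encourage divergent adversary behavior: shaping each pursuer's reward with the average pairwise KL divergence between the adversaries' action distributions. The function names, the KL-based bonus, and the weighting coefficient are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) between two discrete action distributions,
    # with a small epsilon to avoid log(0).
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def diversity_bonus(policies):
    # policies: array of shape (n_adversaries, n_actions), each row the
    # action distribution of one adversary at the same observed state.
    # Returns the mean pairwise KL divergence across the population;
    # identical policies score 0, divergent policies score higher.
    n = len(policies)
    total, count = 0.0, 0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += kl_divergence(policies[i], policies[j])
                count += 1
    return total / max(count, 1)

def shaped_reward(env_reward, policies, beta=0.1):
    # Hypothetical shaping: adversaries earn their environment reward
    # plus a bonus for behaving differently from one another.
    return env_reward + beta * diversity_bonus(policies)
```

Under this kind of shaping, a population of pursuers is nudged away from collapsing onto one strategy, which broadens the set of behaviors the evader sees during training.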