Efficient robotic extraterrestrial exploration requires robots with diverse capabilities, ranging from scientific measurement tools to advanced locomotion. A robotic team enables the distribution of tasks over multiple specialized subsystems, each providing specific expertise to complete the mission. The central challenge lies in coordinating the team efficiently to maximize utilization and the scientific value extracted. Classical planning algorithms scale poorly with problem size: the combinatorial growth of possible robot-target allocations and trajectories leads to long planning cycles and high inference costs. Learning-based methods are a viable alternative that moves the scaling concern from runtime to training time, a critical step towards real-time planning. In this work, we present a collaborative planning strategy based on Multi-Agent Proximal Policy Optimization (MAPPO) that coordinates a team of heterogeneous robots to solve a complex target allocation and scheduling problem. We benchmark our approach against single-objective optimal solutions obtained through exhaustive search and evaluate its ability to perform online replanning in a planetary exploration scenario.