Autonomous robots are being employed in a variety of mapping and data collection tasks due to their efficiency and low labor costs. In these tasks, the robots must map targets-of-interest in an unknown environment while constrained by a given resource budget, such as path length or mission time. This is a challenging problem, as each robot must not only detect and avoid collisions with static obstacles in the environment but also model other robots' trajectories to avoid inter-robot collisions. We propose a novel deep reinforcement learning approach for multi-robot informative path planning to map targets-of-interest in an unknown 3D environment. A key aspect of our approach is an augmented graph that models other robots' trajectories to enable planning for communication and inter-robot collision avoidance. We train our decentralized reinforcement learning policy via the centralized training and decentralized execution paradigm. Once trained, our policy scales to varying numbers of robots without re-training. Our approach outperforms other state-of-the-art multi-robot target mapping approaches by 33.75% in terms of the number of discovered targets-of-interest. We open-source our code and model at: https://github.com/AccGen99/marl_ipp