Coverage path planning (CPP) is the problem of finding a path that covers the entire free space of a confined area, with applications ranging from robotic lawn mowing to search-and-rescue. When the environment is unknown, the path must be planned online while the environment is being mapped, a setting that offline planning methods cannot address because they do not allow for a flexible path space. We investigate how suitable reinforcement learning (RL) is for this challenging problem, and analyze the components required to learn coverage paths efficiently: the action space, input feature representation, neural network architecture, and reward function. We propose a computationally feasible egocentric map representation based on frontiers, and a novel reward term based on total variation to promote complete coverage. Through extensive experiments, we show that our approach surpasses the performance of both previous RL-based approaches and highly specialized methods across multiple CPP variations.
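To make the two proposed ingredients concrete, the following is a minimal sketch of how frontiers and a total-variation reward could be computed on a 2D occupancy grid. It is an illustration under stated assumptions, not the implementation evaluated in the experiments: the cell labels `FREE`, `OBSTACLE`, `UNKNOWN`, the function names, the 4-neighbour (anisotropic) form of total variation, and the `weight` parameter are all hypothetical choices made here.

```python
import numpy as np

# Hypothetical cell labels for an occupancy grid (assumed for this sketch).
FREE, OBSTACLE, UNKNOWN = 0, 1, 2


def frontier_mask(grid: np.ndarray) -> np.ndarray:
    """Mark free cells that have at least one unknown 4-neighbour."""
    # Pad with False so cells on the border have no out-of-map unknown neighbours.
    unknown = np.pad(grid == UNKNOWN, 1, constant_values=False)
    near_unknown = (
        unknown[:-2, 1:-1]    # neighbour above
        | unknown[2:, 1:-1]   # neighbour below
        | unknown[1:-1, :-2]  # neighbour to the left
        | unknown[1:-1, 2:]   # neighbour to the right
    )
    return (grid == FREE) & near_unknown


def total_variation(coverage: np.ndarray) -> float:
    """Anisotropic total variation: sum of absolute differences
    between vertically and horizontally adjacent cells."""
    c = coverage.astype(float)
    vertical = np.abs(np.diff(c, axis=0)).sum()
    horizontal = np.abs(np.diff(c, axis=1)).sum()
    return float(vertical + horizontal)


def tv_reward(prev_coverage: np.ndarray, new_coverage: np.ndarray,
              weight: float = 1.0) -> float:
    """Reward a step by how much it lowers the total variation
    of the binary coverage map (assumed form of the reward term)."""
    return weight * (total_variation(prev_coverage) - total_variation(new_coverage))


if __name__ == "__main__":
    # A covered region with a single uncovered hole in the middle.
    with_hole = np.ones((3, 3), dtype=int)
    with_hole[1, 1] = 0
    filled = np.ones((3, 3), dtype=int)
    print("reward for filling the hole:", tv_reward(with_hole, filled))

    # Unknown space lies along the right-hand column of this small map.
    grid = np.array([[FREE, FREE, UNKNOWN],
                     [FREE, OBSTACLE, UNKNOWN],
                     [FREE, FREE, UNKNOWN]])
    print("frontier cells:", np.argwhere(frontier_mask(grid)).tolist())
```

The intuition behind this sketch is that a coverage map with small uncovered holes or ragged boundaries has a high total variation, so a reward that favours reducing it discourages leaving gaps that are costly to revisit later.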