Autonomous navigation in unknown environments without a global map is a long-standing challenge for mobile robots. Deep reinforcement learning (DRL) has attracted rapidly growing interest as a solution to this problem because of its generalization capability, yet in practice it often yields mediocre navigation performance owing to the gap between the training scene and the actual test scene. Most existing work focuses on tuning the algorithm to enhance its transferability, whereas few studies investigate how to quantify or measure this gap. This letter presents a local map-based deep Q-network (DQN) navigation algorithm, which uses local maps converted from 2D LiDAR data as observations and requires no global map. More importantly, this letter proposes a new transferability metric: the scene similarity, calculated with an improved image template matching algorithm, which measures the similarity between the training and test scenes. With a wheeled robot as the case study platform, both simulation and real-world experiments are conducted in a total of 20 different scenes. The case study results validate both the local map-based navigation algorithm and the use of the similarity metric to predict the transferability, or success rate, of the algorithm.
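The two ideas named in the abstract, a local map built from a 2D LiDAR scan and a template-matching style similarity between scenes, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the grid size and resolution, and the parameter names `angle_min` and `angle_increment` are hypothetical. The similarity is plain zero-mean normalized cross-correlation, a common template-matching score, evaluated at a single offset. It is not the improved algorithm proposed in the letter, whose details the abstract does not give.

```python
import math


def scan_to_local_map(ranges, angle_min, angle_increment,
                      size=20, resolution=0.25, max_range=2.5):
    """Rasterize a 2D LiDAR scan into a robot-centred occupancy grid.

    All names and defaults are illustrative assumptions.
    Returns a size x size list of lists: 1 = beam endpoint (obstacle), 0 = otherwise.
    """
    grid = [[0] * size for _ in range(size)]
    half = size // 2
    for i, r in enumerate(ranges):
        # Skip invalid or out-of-range returns.
        if not math.isfinite(r) or r <= 0.0 or r > max_range:
            continue
        theta = angle_min + i * angle_increment
        x = r * math.cos(theta)  # forward, in metres
        y = r * math.sin(theta)  # left, in metres
        col = half + int(math.floor(x / resolution))
        row = half - 1 - int(math.floor(y / resolution))
        if 0 <= row < size and 0 <= col < size:
            grid[row][col] = 1
    return grid


def ncc_similarity(map_a, map_b):
    """Zero-mean normalized cross-correlation of two equal-size grids.

    Returns a value in [-1, 1]; returns 0.0 if either grid is constant.
    """
    flat_a = [v for row in map_a for v in row]
    flat_b = [v for row in map_b for v in row]
    if len(flat_a) != len(flat_b) or not flat_a:
        raise ValueError("maps must be non-empty and the same size")
    mean_a = sum(flat_a) / len(flat_a)
    mean_b = sum(flat_b) / len(flat_b)
    num = sum((a - mean_a) * (b - mean_b) for a, b in zip(flat_a, flat_b))
    den = math.sqrt(sum((a - mean_a) ** 2 for a in flat_a)
                    * sum((b - mean_b) ** 2 for b in flat_b))
    return num / den if den > 0 else 0.0


# Usage: a single beam straight ahead at 1 m marks exactly one cell.
local_map = scan_to_local_map([1.0], angle_min=0.0, angle_increment=0.1)
score = ncc_similarity(local_map, local_map)
```

A full template-matching pipeline would typically slide one map over the other and keep the best score. The single-offset version here only shows the shape of the computation.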