ARiADNE: A Reinforcement learning approach using Attention-based Deep Networks for Exploration

From arXiv. © 20XX IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

In autonomous robot exploration tasks, a mobile robot needs to actively explore and map an unknown environment as fast as possible. Since the environment is being revealed during exploration, the robot needs to frequently re-plan its path online, as new information is acquired by onboard sensors and used to update its partial map. While state-of-the-art exploration planners are frontier- and sampling-based, encouraged by the recent development in deep reinforcement learning (DRL), we propose ARiADNE, an attention-based neural approach to obtain real-time, non-myopic path planning for autonomous exploration. ARiADNE is able to learn dependencies at multiple spatial scales between areas of the agent's partial map, and implicitly predict potential gains associated with exploring those areas. This allows the agent to sequence movement actions that balance the natural trade-off between exploitation/refinement of the map in known areas and exploration of new areas. We experimentally demonstrate that our method outperforms both learning and non-learning state-of-the-art baselines in terms of average trajectory length to complete exploration in hundreds of simplified 2D indoor scenarios. We further validate our approach in high-fidelity Robot Operating System (ROS) simulations, where we consider a real sensor model and a realistic low-level motion controller, toward deployment on real robots.
