This paper presents a pre-training approach to deep reinforcement learning (DRL) for mapless obstacle-avoidance navigation of mobile robots, in which raw sensor data are mapped directly to control variables so that the robot can navigate in an unknown environment. We propose an efficient offline training strategy to speed up the inefficient random exploration of the early training stage, and we collect a general-purpose dataset that includes expert experience for offline training, which may also be of use to other navigation training work. Pre-training combined with prioritized expert experience reduces training time by 80\% and has been verified to improve the reward of DRL by a factor of two. The advanced simulator Gazebo, with realistic physical modelling and dynamic equations, narrows the sim-to-real gap. We train our model in a corridor environment and evaluate it in different environments, where it achieves the same performance. Compared with traditional navigation methods, we confirm that the trained model can be applied directly to different scenarios and can navigate without collisions. These results demonstrate that our DRL model generalizes across different environments.