Autonomous navigation of Unmanned Surface Vehicles (USVs) in marine environments with current flows is challenging, and few prior works have addressed sensor-based navigation in such environments without prior knowledge of the current flow and obstacles. We propose a Distributional Reinforcement Learning (RL) based local path planner that learns return distributions capturing the uncertainty of action outcomes, together with an adaptive algorithm that automatically tunes the planner's sensitivity to risk in the environment. The proposed planner learns more stably and converges to safer policies than a traditional RL-based planner. Computational experiments demonstrate that, compared to a traditional RL-based planner and classical local planning methods such as Artificial Potential Fields and the Bug Algorithm, the proposed planner is robust against environmental flows and plans trajectories that are superior in safety, time, and energy consumption.
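To make the idea of risk-sensitive action selection over learned return distributions concrete, the following is a minimal sketch and not the planner described above. It assumes that each action's return distribution is represented by a small set of equally weighted quantile values, that risk sensitivity is expressed as a Conditional Value at Risk (CVaR) level `alpha`, and that `alpha` is adapted from the distance to the nearest sensed obstacle. The abstract specifies none of these choices; the function names, the linear adaptation rule, and the toy numbers are all hypothetical.

```python
import numpy as np


def adaptive_risk_level(nearest_obstacle_dist, sensing_range, alpha_min=0.1):
    """Hypothetical adaptation rule (an assumption, not from the source):
    shrink the CVaR level linearly as obstacles get closer, so the agent
    becomes more risk averse near hazards and risk neutral in open water."""
    ratio = nearest_obstacle_dist / sensing_range
    return float(np.clip(ratio, alpha_min, 1.0))


def cvar(quantiles, alpha):
    """CVaR at level alpha for each action.

    quantiles: array of shape (n_actions, n_quantiles) holding equally
    weighted samples of each action's return distribution.
    Returns the mean of the worst ceil(alpha * n_quantiles) samples per action.
    """
    sorted_q = np.sort(quantiles, axis=1)
    k = max(1, int(np.ceil(alpha * sorted_q.shape[1])))
    return sorted_q[:, :k].mean(axis=1)


def select_action(quantiles, alpha):
    """Pick the action with the highest CVaR; alpha = 1.0 recovers the
    usual expected-return (risk-neutral) choice."""
    return int(np.argmax(cvar(quantiles, alpha)))


# Toy return distributions (made-up numbers for illustration only).
quantiles = np.array([
    [-10.0, 4.0, 5.0, 6.0],  # action 0: higher mean, heavy lower tail
    [0.0, 1.0, 1.0, 2.0],    # action 1: lower mean, no severe outcomes
])

# Far from obstacles: alpha = 1.0, so the choice is based on the mean.
risk_neutral = select_action(quantiles, alpha=1.0)

# Close to an obstacle: alpha shrinks, so the worst outcomes dominate.
risk_averse = select_action(quantiles, adaptive_risk_level(2.0, 20.0))
```

In this toy case the risk-neutral choice favors the action with the higher mean return, while the risk-averse choice switches to the action whose worst outcome is less severe. That illustrates how tuning risk sensitivity can steer a policy toward safer behavior without retraining the return distributions.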