Deep reinforcement learning (DRL) methods have recently shown promise in path planning tasks. However, in global planning tasks these methods face serious challenges such as poor convergence and poor generalization. To address these challenges, we propose an attention-enhanced DRL method called LOPA (Learn Once Plan Arbitrarily). First, we analyze the causes of these problems from the perspective of the DRL observation and show that the traditional observation design leaves DRL susceptible to interference from irrelevant map information. Second, we develop LOPA, which uses a novel attention-enhanced mechanism to focus more effectively on the key information in the observation. The mechanism is realized in two steps: (1) an attention model transforms the DRL observation into two dynamic views, local and global, guiding LOPA to focus on the key information in the given maps; (2) a dual-channel network processes the two views and integrates them to improve reasoning capability. LOPA is validated in multi-objective global path planning experiments. The results suggest that LOPA improves convergence and generalization performance and achieves high path planning efficiency.
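To make the two-view idea concrete, the following is a minimal sketch of how a grid-map observation could be split into a local view and a global view that both change as the agent moves. It is not the LOPA attention model, whose details are not given in this abstract. Everything here is an assumption made for illustration: the occupancy-grid encoding, the names `local_view`, `global_view`, and `fuse`, the `radius` and `block` parameters, and the use of plain concatenation as a stand-in for the dual-channel network.

```python
from typing import List, Tuple

Grid = List[List[int]]  # assumed encoding: 0 = free cell, 1 = obstacle
Cell = Tuple[int, int]  # (row, col)


def local_view(grid: Grid, agent: Cell, radius: int = 1) -> Grid:
    """Crop a (2*radius+1)-wide square centred on the agent.

    Cells outside the map are treated as obstacles (hypothetical choice).
    """
    rows, cols = len(grid), len(grid[0])
    r0, c0 = agent
    view = []
    for r in range(r0 - radius, r0 + radius + 1):
        row = []
        for c in range(c0 - radius, c0 + radius + 1):
            inside = 0 <= r < rows and 0 <= c < cols
            row.append(grid[r][c] if inside else 1)
        view.append(row)
    return view


def global_view(grid: Grid, agent: Cell, goal: Cell, block: int = 2):
    """Downsample the map to obstacle densities per block x block tile.

    Also returns the coarse tiles that contain the agent and the goal.
    """
    rows, cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, rows, block):
        row = []
        for c in range(0, cols, block):
            cells = [
                grid[i][j]
                for i in range(r, min(r + block, rows))
                for j in range(c, min(c + block, cols))
            ]
            row.append(sum(cells) / len(cells))
        coarse.append(row)
    agent_tile = (agent[0] // block, agent[1] // block)
    goal_tile = (goal[0] // block, goal[1] // block)
    return coarse, agent_tile, goal_tile


def fuse(local: Grid, coarse: List[List[float]]) -> List[float]:
    """Stand-in for the dual-channel integration: flatten and concatenate."""
    flat_local = [float(v) for row in local for v in row]
    flat_coarse = [v for row in coarse for v in row]
    return flat_local + flat_coarse


# Hypothetical 4x4 map used only to exercise the functions above.
EXAMPLE_GRID: Grid = [
    [0, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]
```

In this sketch the local view keeps fine detail near the agent, while the global view keeps only a coarse summary of the whole map plus the agent and goal tiles. In a learned system each view would be fed to its own network branch rather than concatenated directly.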