Automated parking is a highly anticipated application of autonomous driving technology. However, existing path planning methods fall short of this need because they cannot handle the diverse and complex parking scenarios encountered in practice. Non-learning methods provide reliable planning results but are brittle in intricate scenarios, whereas learning-based methods excel at exploration yet struggle to converge stably to feasible solutions. To leverage the strengths of both approaches, we introduce the Hybrid pOlicy Path plannEr (HOPE). This novel solution integrates a reinforcement learning agent with Reeds-Shepp curves, enabling effective planning across diverse scenarios. HOPE guides the exploration of the reinforcement learning agent through an action mask mechanism and employs a transformer to fuse the perceived environmental information with the mask. To facilitate training and evaluation of the proposed planner, we propose a criterion for categorizing the difficulty of parking scenarios based on space and obstacle distribution. Experimental results demonstrate that our approach outperforms typical rule-based algorithms and traditional reinforcement learning methods, achieving higher planning success rates and better generalization across various scenarios. We also conduct real-world experiments to verify the practicality of HOPE. The code for our solution will be openly available on \href{https://github.com/jiamiya/HOPE}{GitHub}.