Path planning in high-dimensional spaces poses significant challenges, particularly in achieving both time efficiency and a reasonable success rate. To address these issues, we introduce a novel path-planning algorithm, Zonal RL-RRT, which uses kd-tree partitioning to segment the map into zones and explicitly handles zone connectivity so that transitions between zones are seamless. By breaking the complex environment into multiple zones and using Q-learning as the high-level decision-maker, our algorithm achieves a 3x improvement in time efficiency over basic sampling methods such as RRT and RRT* in forest-like maps. It runs 1.5x faster than heuristic-guided methods such as BIT* and Informed RRT* while maintaining robust and reliable success rates across 2D to 6D environments. Compared to learning-based methods such as NeuralRRT* and MPNetSMP, as well as the heuristic RRT*J, our algorithm demonstrates, on average, 1.5x better performance in the same environments. We also evaluate the effectiveness of our approach through simulations of the UR10e arm manipulator in the MuJoCo environment. A key feature of our approach is its use of zone partitioning and reinforcement learning (RL) for adaptive high-level planning, which allows the algorithm to accommodate flexible policies across diverse environments and makes it a versatile tool for advanced path planning.
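To make the two-level idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes midpoint splits that cycle through the axes (a simplified stand-in for a kd-tree built on data), treats two zones as connected when they share a face, and uses tabular Q-learning with a hypothetical reward of -1 per zone transition and +10 on entering the goal zone. All function names and hyperparameters are illustrative. A low-level planner such as RRT would then connect the consecutive zones of the returned sequence; that step is omitted here.

```python
import random
from collections import defaultdict

def kd_partition(bounds, depth=0, max_depth=4):
    """Split an axis-aligned box at its midpoint, cycling axes (kd-tree style)."""
    if depth == max_depth:
        return [bounds]
    axis = depth % len(bounds)
    lo, hi = bounds[axis]
    mid = (lo + hi) / 2.0
    left = bounds[:axis] + ((lo, mid),) + bounds[axis + 1:]
    right = bounds[:axis] + ((mid, hi),) + bounds[axis + 1:]
    return (kd_partition(left, depth + 1, max_depth)
            + kd_partition(right, depth + 1, max_depth))

def adjacent(a, b):
    """Two zones are connected if they touch on exactly one axis and overlap on the rest."""
    touching = 0
    for (alo, ahi), (blo, bhi) in zip(a, b):
        if ahi == blo or bhi == alo:
            touching += 1
        elif min(ahi, bhi) <= max(alo, blo):
            return False  # a gap separates the zones on this axis
    return touching == 1

def zone_of(point, zones):
    """Index of the first zone that contains the point."""
    for i, zone in enumerate(zones):
        if all(lo <= p <= hi for p, (lo, hi) in zip(point, zone)):
            return i
    raise ValueError("point lies outside the map")

def train(neighbors, start, goal, episodes=2000, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning where a state is a zone and an action is a move to a neighbor zone."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        s = start
        for _ in range(50):  # cap on episode length
            if s == goal:
                break
            acts = neighbors[s]
            if rng.random() < eps:
                a = rng.choice(acts)  # explore
            else:
                a = max(acts, key=lambda n: q[(s, n)])  # exploit
            reward = 10.0 if a == goal else -1.0  # assumed reward shaping
            future = 0.0 if a == goal else max(q[(a, n)] for n in neighbors[a])
            q[(s, a)] += alpha * (reward + gamma * future - q[(s, a)])
            s = a
    return q

def greedy_zone_path(q, neighbors, start, goal, max_len=50):
    """Follow the highest-valued neighbor from the start zone toward the goal zone."""
    path = [start]
    while path[-1] != goal and len(path) < max_len:
        s = path[-1]
        path.append(max(neighbors[s], key=lambda n: q[(s, n)]))
    return path

# Demo on an obstacle-free 8 x 8 map split into 16 zones.
zones = kd_partition(((0.0, 8.0), (0.0, 8.0)), max_depth=4)
neighbors = {i: [j for j in range(len(zones)) if adjacent(zones[i], zones[j])]
             for i in range(len(zones))}
start = zone_of((0.5, 0.5), zones)
goal = zone_of((7.5, 7.5), zones)
q = train(neighbors, start, goal)
path = greedy_zone_path(q, neighbors, start, goal)
print(path)
```

In this sketch the Q-table only decides which zone to visit next, so its size depends on the number of zones rather than on the resolution of the configuration space. Obstacles could be modeled by removing blocked zones from `neighbors`, which is again an assumption of this example rather than a detail taken from the paper.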