Efficiently finding the optimal path in a complex warehouse layout while making real-time decisions is a key problem in robot navigation. This paper proposes a new method, Proximal Policy-Dijkstra (PP-D), which combines Proximal Policy Optimization (PPO) with Dijkstra's algorithm. PP-D uses PPO for efficient policy learning and real-time decision making, and uses Dijkstra's algorithm to plan the globally optimal path, thereby ensuring high navigation accuracy and significantly improving path-planning efficiency. Specifically, PPO's stable policy-update mechanism enables robots to adapt quickly and optimize their action policies in dynamic environments, while Dijkstra's algorithm guarantees globally optimal path planning in static environments. Comparative experiments against traditional algorithms show that PP-D has significant advantages in improving navigation prediction accuracy and enhancing system robustness. In complex warehouse layouts in particular, PP-D finds the optimal path more accurately and reduces collisions and stagnation. These results demonstrate the reliability and effectiveness of the proposed algorithm for robot navigation in complex warehouse layouts.
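The two building blocks named above can be illustrated with a minimal sketch. The abstract does not specify how PP-D couples PPO with Dijkstra's algorithm, so the code below only shows each component in isolation and is not the authors' implementation. The grid encoding (`#` for a shelf, `.` for free floor), the unit move cost, the 4-connected movement model, and the names `dijkstra_grid`, `ppo_clipped_term`, and `WAREHOUSE` are all assumptions made for illustration.

```python
import heapq
import math


def dijkstra_grid(grid, start, goal):
    """Shortest 4-connected path on a grid with unit move cost.

    grid  : list of equal-length strings; '#' is blocked, anything else is free
            (an assumed encoding, not taken from the paper).
    start : (row, col) tuple of the start cell.
    goal  : (row, col) tuple of the goal cell.
    Returns (cost, path); (math.inf, []) if the goal is unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}          # best known cost to each visited cell
    prev = {}                  # predecessor links for path reconstruction
    heap = [(0, start)]        # priority queue ordered by cost
    while heap:
        d, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if d > dist[cell]:
            continue           # stale queue entry, a cheaper route was found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                nd = d + 1
                if nd < dist.get((nr, nc), math.inf):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = cell
                    heapq.heappush(heap, (nd, (nr, nc)))
    if goal not in dist:
        return math.inf, []
    # Walk the predecessor links back from the goal to the start.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[goal], path


def ppo_clipped_term(ratio, advantage, eps=0.2):
    """Per-sample PPO clipped surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A).

    ratio     : new-policy probability divided by old-policy probability.
    advantage : advantage estimate for the sampled action.
    eps       : clip range (0.2 is a common default, assumed here).
    """
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped_ratio * advantage)


# Hypothetical 4x4 warehouse with a 2x2 shelf block in the middle.
WAREHOUSE = [
    "....",
    ".##.",
    ".##.",
    "....",
]

if __name__ == "__main__":
    cost, path = dijkstra_grid(WAREHOUSE, (0, 0), (3, 3))
    print(cost, path)                    # cost is 6 on this layout
    print(ppo_clipped_term(1.5, 1.0))    # ratio is clipped to 1.2
```

The clipping in `ppo_clipped_term` is what limits how far a single update can move the policy, which is the sense in which PPO's updates are described as stable. One plausible design, again an assumption rather than something the abstract states, would use the Dijkstra path as a global reference and let the learned policy handle local, dynamic deviations.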