Thanks to its robust learning and search stability, the reinforcement learning (RL) algorithm has attracted increasing attention and been extensively applied to Automated Guided Vehicle (AGV) path planning. However, RL-based planning algorithms suffer from substantial variance in the neural network, caused by environmental instability and significant fluctuations in system structure; these challenges manifest as slow convergence and low learning efficiency. To address this issue, this paper presents a novel multi-AGV path planning method, Particle Filter - Double Deep Q-Network (PF-DDQN), which combines Particle Filters (PF) with the RL algorithm. First, the proposed method treats the imprecise weight values of the network as state values to formulate the state-space equation. Next, the DDQN model is optimized to acquire the optimal true weight values through an iterative fusion of the neural network and the PF, which improves the optimization efficiency of the proposed method. Finally, the performance of the proposed method is validated in different numerical simulations. The simulation results show that the proposed method outperforms the traditional DDQN algorithm by 92.62% in path planning quality and by 76.88% in training time. The proposed method can therefore be considered a viable alternative in the field of multi-AGV path planning.
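The core idea of treating imprecise network weights as a state to be estimated by a particle filter can be illustrated with a minimal, self-contained sketch. Everything here (the two-dimensional "true" weight vector, the noise levels, the particle count) is an illustrative assumption, not the paper's implementation; the sketch only shows the generic PF cycle of propagate, reweight by likelihood, and resample:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" weights that training would ideally recover; the noisy
# weights produced by each training step play the role of observations.
true_w = np.array([0.5, -1.2])
n_particles = 500
obs_noise = 0.1

# Initialize particles around a rough initial guess of the weights.
particles = rng.normal(0.0, 1.0, size=(n_particles, 2))
weights = np.full(n_particles, 1.0 / n_particles)

for _ in range(50):
    # Simulated observation: noisy weight estimate from one training step.
    obs = true_w + rng.normal(0.0, obs_noise, size=2)
    # Propagate particles with small process noise.
    particles += rng.normal(0.0, 0.02, size=particles.shape)
    # Reweight by Gaussian likelihood of the observation given each particle.
    err = np.sum((particles - obs) ** 2, axis=1)
    weights *= np.exp(-0.5 * err / obs_noise**2)
    weights /= weights.sum()
    # Resample when the effective sample size degenerates.
    if 1.0 / np.sum(weights**2) < n_particles / 2:
        idx = rng.choice(n_particles, size=n_particles, p=weights)
        particles = particles[idx]
        weights = np.full(n_particles, 1.0 / n_particles)

# Weighted mean of the particles is the filtered weight estimate.
estimate = weights @ particles
print(estimate)
```

In the paper's setting this filtered estimate would replace the raw (high-variance) network weights, which is the claimed source of the faster, more stable convergence.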