Motion planning is an important research topic for achieving safe and flexible maneuvers in intelligent vehicles. However, efficient and optimal planning remains challenging when the model dynamics are uncertain. This paper proposes a sparse kernel-based reinforcement learning (RL) algorithm with Gaussian process (GP) regression, called GP-SKRL, to achieve online adaptation and near-optimal motion planning performance. In this algorithm, an efficient sparse GP regression method learns the uncertain dynamics. Based on the updated model, a sparse kernel-based policy iteration algorithm with an exponential barrier function learns near-optimal planning policies that can avoid dynamic obstacles. Batch-mode GP-SKRL with online adaptation capability can thereby estimate the changing system dynamics. The converged RL policies are then deployed efficiently on vehicles under a safety-aware module. As a result, the produced driving actions are safe and less conservative, and planning performance is noticeably improved. Extensive simulation results show that GP-SKRL outperforms several advanced motion planning methods in terms of average cumulative cost, trajectory length, and task completion time. In particular, experiments on a Hongqi E-HS3 vehicle demonstrate that GP-SKRL provides a practical planning solution.
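To make the two main ingredients of the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a squared-exponential kernel, a greedy approximate-linear-dependence test as the sparsification rule, and a simple `exp(-sharpness * (distance - safe_distance))` form for the exponential barrier; the abstract does not specify any of these choices. All function names and parameters (`select_dictionary`, `gp_predict`, `barrier_cost`, `threshold`, `sharpness`, and so on) are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def select_dictionary(X, threshold=0.01, length_scale=1.0):
    """Greedy sparsification (assumed rule, not taken from the paper).

    A sample joins the dictionary only if its kernel feature is poorly
    approximated by the features of the samples already kept.
    """
    idx = [0]
    for i in range(1, len(X)):
        D = X[idx]
        K = rbf_kernel(D, D, length_scale) + 1e-8 * np.eye(len(idx))
        k = rbf_kernel(D, X[i:i + 1], length_scale)[:, 0]
        # k(x, x) = 1 for this kernel, so delta is the approximation residual.
        delta = 1.0 - k @ np.linalg.solve(K, k)
        if delta > threshold:
            idx.append(i)
    return np.array(idx)


def gp_predict(X_train, y_train, X_test, noise=1e-3, length_scale=1.0):
    """Standard GP regression mean and latent variance on the given training set."""
    K = rbf_kernel(X_train, X_train, length_scale) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_test, X_train, length_scale)
    mean = K_s @ np.linalg.solve(K, y_train)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s.T).T, axis=1)
    return mean, var


def barrier_cost(distance, safe_distance=1.0, weight=1.0, sharpness=2.0):
    """Hypothetical exponential barrier: cost grows rapidly as an obstacle gets closer."""
    return weight * np.exp(-sharpness * (distance - safe_distance))
```

In this sketch the GP is fitted only on the dictionary samples, for example `gp_predict(X[idx], y[idx], X_query)` with `idx = select_dictionary(X)`, so the cost of each model update scales with the dictionary size rather than with the full data set. The barrier term would be added to the stage cost used during policy iteration; how the paper combines the two is not stated in the abstract.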