Safe maneuvering capability is critical for mobile robots in complex environments. However, robotic system dynamics are often time-varying, uncertain, or even unknown during motion planning and control. As a result, many existing model-based reinforcement learning (RL) methods cannot achieve satisfactory reliability in guaranteeing safety. To address this challenge, we propose a two-level Vector Field-guided Learning Predictive Control (VF-LPC) approach that guarantees safe maneuverability. The first level, the guiding level, generates safe desired trajectories using a purpose-designed kinodynamic guiding vector field, enabling safe motion in obstacle-dense environments. The second level, the Integrated Motion Planning and Control (IMPC) level, first uses a deep Koopman operator to learn a nominal dynamics model offline and then updates the model uncertainties online using sparse Gaussian processes (GPs). The learned dynamics and a game-based safe barrier function are then incorporated into the learning predictive control framework to generate near-optimal control sequences. We first compare VF-LPC with existing advanced planning methods in an obstacle-dense environment; the simulation results show that VF-LPC generates feasible trajectories quickly. We then evaluate VF-LPC against motion planning methods based on model predictive control (MPC) and RL in the high-fidelity CarSim simulator. The results show that VF-LPC outperforms them in terms of completion time, route length, and average solution time. We also carry out path-tracking control tests on a racing road to validate the approach's ability to learn model uncertainties. Finally, we conduct real-world experiments on a Hongqi E-HS3 vehicle, further validating the effectiveness of VF-LPC.
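To make the offline model-learning step more concrete, the following is a minimal sketch of fitting a linear model in lifted coordinates, which is the idea behind Koopman-based dynamics learning. It is not the VF-LPC implementation: the abstract describes a deep Koopman operator with a learned lifting, whereas this sketch assumes a fixed, hand-picked lifting (`lift`, the state plus its squares) and an ordinary least-squares fit. All function names and the toy system are hypothetical and chosen only for illustration.

```python
import numpy as np


def lift(x):
    """Hypothetical fixed lifting: the state followed by its squares.

    Stand-in for the learned (deep) lifting; assumed for illustration only.
    """
    x = np.atleast_2d(x)
    return np.hstack([x, x**2])


def fit_lifted_linear_model(X, U, X_next):
    """Least-squares fit of z_next ~ A z + B u, where z = lift(x).

    X, X_next: (N, n) state samples; U: (N, m) input samples.
    """
    Z = lift(X)
    Z_next = lift(X_next)
    ZU = np.hstack([Z, U])
    # Solve ZU @ W ~ Z_next; W stacks A^T on top of B^T.
    W, *_ = np.linalg.lstsq(ZU, Z_next, rcond=None)
    n_z = Z.shape[1]
    A = W[:n_z].T
    B = W[n_z:].T
    return A, B


def predict(A, B, x, u):
    """One-step state prediction through the lifted linear model."""
    z = lift(x)[0]
    z_next = A @ z + B @ np.atleast_1d(u)
    # The first entries of the lifted vector are the state itself.
    return z_next[: np.size(x)]


# Toy data from an assumed linear system (not from the paper).
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.1], [0.0, 0.8]])
B_true = np.array([[0.0], [0.5]])
X = rng.normal(size=(200, 2))
U = rng.normal(size=(200, 1))
X_next = X @ A_true.T + U @ B_true.T

A, B = fit_lifted_linear_model(X, U, X_next)
x0 = np.array([0.3, -0.2])
u0 = np.array([0.1])
x1_pred = predict(A, B, x0, u0)
```

Because the lifted model is linear in the lifted state and the input, it can be used inside a predictive controller in place of a nonlinear model. In the approach described above, an online sparse GP would additionally correct the residual between such a nominal model and the measured dynamics; that step is not shown here.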