In autonomous driving, end-to-end methods based on Imitation Learning (IL) and Reinforcement Learning (RL) are becoming increasingly common. However, unlike the classic robotics workflow, they involve neither explicit reasoning nor planning over a horizon, so the resulting strategies are implicit and myopic. In this paper, we introduce a path planning method that uses Behavioral Cloning (BC) for path-tracking and Proximal Policy Optimization (PPO) for static obstacle nudging. The method outputs lateral offset values that adjust the given reference waypoints, producing a modified path that different controllers can then track. Experimental results show that the algorithm follows paths in a way that mimics the expert performance of path-tracking controllers and avoids collisions with fixed obstacles. The method is a useful attempt at planning with learning-based methods in path planning problems of autonomous driving.
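To make the idea of adjusting reference waypoints with lateral offsets concrete, the following is a minimal sketch rather than the implementation used in this work. It assumes one signed offset per waypoint, applied along the left-pointing normal of the local path direction. The function name `apply_lateral_offsets`, the sign convention, and the hard-coded offsets are illustrative assumptions; in the method described above the offsets would come from the learned policy.

```python
import math


def apply_lateral_offsets(waypoints, offsets):
    """Shift each (x, y) waypoint sideways by a signed lateral offset.

    Assumption for this sketch: positive offsets move the point to the
    left of the direction of travel, negative offsets to the right.
    """
    if len(waypoints) != len(offsets):
        raise ValueError("need one offset per waypoint")
    if len(waypoints) < 2:
        raise ValueError("need at least two waypoints to define a heading")

    shifted = []
    last = len(waypoints) - 1
    for i, ((x, y), d) in enumerate(zip(waypoints, offsets)):
        # Estimate the local path direction from the neighbouring waypoints
        # (central difference inside the path, one-sided at the ends).
        x0, y0 = waypoints[max(i - 1, 0)]
        x1, y1 = waypoints[min(i + 1, last)]
        tx, ty = x1 - x0, y1 - y0
        length = math.hypot(tx, ty)
        if length == 0.0:
            # Degenerate neighbours: leave the waypoint where it is.
            shifted.append((x, y))
            continue
        # Left-pointing unit normal: the tangent rotated by +90 degrees.
        nx, ny = -ty / length, tx / length
        shifted.append((x + d * nx, y + d * ny))
    return shifted


# Hypothetical straight reference path along the x-axis; the offsets stand
# in for policy outputs and nudge the last two waypoints left, then right.
reference = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
modified = apply_lateral_offsets(reference, [0.0, 0.5, -0.5])
```

Because the result is again a plain list of waypoints, any path-tracking controller that accepts waypoints could in principle consume it unchanged, which is one plausible reading of how the modified path is shared across different controllers.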