Predicting the future states of surrounding traffic participants and planning a safe, smooth, and socially compliant trajectory accordingly are crucial for autonomous vehicles. Current autonomous driving systems have two major issues: the prediction module is often separated from the planning module, and the cost function for planning is hard to specify and tune. To tackle these issues, we propose a differentiable integrated prediction-planning framework (DIPP) that can also learn the cost function from data. Specifically, our framework uses a differentiable nonlinear optimizer as the motion planner, which takes as input the trajectories of surrounding agents predicted by a neural network and optimizes the trajectory of the autonomous vehicle. Because every operation is differentiable, gradients can flow through the planner to both the prediction network and the cost function weights. The proposed framework is trained on a large-scale real-world driving dataset to imitate human driving trajectories in the entire driving scene and is validated in both open-loop and closed-loop manners. The open-loop testing results reveal that the proposed method outperforms the baseline methods across a variety of metrics and delivers planning-centric prediction results, allowing the planning module to output trajectories close to those of human drivers. In closed-loop testing, the proposed method also outperforms various baseline methods, showing the ability to handle complex urban driving scenarios and robustness against distributional shift. Importantly, we find that joint training of the planning and prediction modules achieves better performance than planning with a separately trained prediction module in both open-loop and closed-loop tests. Moreover, the ablation study indicates that the learnable components in the framework are essential to ensure planning stability and performance.
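To make the idea of differentiating through the planner concrete, the following is a minimal toy sketch, not the DIPP implementation. It assumes a one-dimensional planner whose cost is a weighted least-squares sum with a closed-form minimizer, so the gradients can be written by hand instead of coming from a nonlinear optimizer. All names (`plan`, `loss_and_grads`, `fit_weight`, `center`, `safe`, `w_safe`) are hypothetical, and `safe` stands in for a target position derived from a predicted agent trajectory.

```python
import math


def plan(center, safe, w_safe):
    """Toy planner: closed-form minimizer of (x - center)^2 + w_safe * (x - safe)^2."""
    return (center + w_safe * safe) / (1.0 + w_safe)


def loss_and_grads(theta, center, safe, human):
    """Imitation loss (x* - human)^2 and its gradients w.r.t. theta and safe.

    The weight is parameterized as w_safe = exp(theta) so it stays positive.
    """
    w = math.exp(theta)
    x = plan(center, safe, w)
    err = x - human
    dx_dw = (safe - center) / (1.0 + w) ** 2  # sensitivity of the plan to the weight
    dx_dsafe = w / (1.0 + w)                  # sensitivity of the plan to the prediction
    grad_theta = 2.0 * err * dx_dw * w        # chain rule through w = exp(theta)
    grad_safe = 2.0 * err * dx_dsafe          # gradient that would reach a predictor
    return err ** 2, grad_theta, grad_safe


def fit_weight(samples, theta=0.0, lr=2.0, steps=2000):
    """Learn the cost weight by gradient descent on the mean imitation loss."""
    for _ in range(steps):
        grad = sum(loss_and_grads(theta, c, s, h)[1] for c, s, h in samples)
        theta -= lr * grad / len(samples)
    return math.exp(theta)


# Synthetic "human" plans generated with an assumed ground-truth weight of 3.0.
TRUE_W = 3.0
SAMPLES = [(0.0, s, plan(0.0, s, TRUE_W)) for s in (0.5, 1.0, -0.8, 1.5)]
learned_w = fit_weight(SAMPLES)
```

In this sketch, `grad_theta` plays the role of the signal that tunes the cost function from data, while `grad_safe` illustrates how a planning loss could shape the prediction. A real nonlinear planner has no closed-form solution, so those gradients would instead be obtained by differentiating through the optimizer's iterations or its optimality conditions.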