End-to-end autonomous driving has recently emerged as a promising research direction that targets autonomy from a full-stack perspective. Along this line, many recent works adopt an open-loop evaluation setting on nuScenes to study planning behavior. In this paper, we delve deeper into the problem by conducting thorough analyses and demystifying more devils in the details. We first observe that the nuScenes dataset, characterized by relatively simple driving scenarios, leads to an under-utilization of perception information in end-to-end models that incorporate ego status, such as the ego vehicle's velocity. These models tend to rely predominantly on the ego vehicle's status for future path planning. Beyond the limitations of the dataset, we also note that current metrics do not comprehensively assess planning quality, which can lead to biased conclusions drawn from existing benchmarks. To address this issue, we introduce a new metric that evaluates whether the predicted trajectories adhere to the road. We further propose a simple baseline that achieves competitive results without relying on perception annotations. Given the current limitations of the benchmark and metrics, we suggest that the community reassess the relevant prevailing research and consider carefully whether the continued pursuit of state-of-the-art results would yield convincing and universal conclusions. Code and models are available at \url{https://github.com/NVlabs/BEV-Planner}.