Autonomous driving requires a comprehensive understanding of the surrounding environment for reliable trajectory planning. Previous works rely on dense rasterized scene representations (e.g., agent occupancy and semantic maps) to perform planning, which are computationally intensive and discard instance-level structural information. In this paper, we propose VAD, an end-to-end vectorized paradigm for autonomous driving that models the driving scene as a fully vectorized representation. The proposed paradigm has two significant advantages. First, VAD uses vectorized agent motion and map elements as explicit instance-level planning constraints, which effectively improves planning safety. Second, VAD runs much faster than previous end-to-end planning methods because it dispenses with computation-intensive rasterized representations and hand-designed post-processing steps. VAD achieves state-of-the-art end-to-end planning performance on the nuScenes dataset, outperforming the previous best method by a large margin (reducing the average collision rate by 48.4%). VAD also greatly improves inference speed (up to 9.3x), which is critical for real-world deployment of an autonomous driving system. Code and models will be released to facilitate future research.
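To make the idea of instance-level planning constraints concrete, the following is a minimal sketch of how a planned trajectory could be penalized against vectorized agent motion and map elements. It is an illustration under stated assumptions, not the formulation used by VAD: the function names, the hinge-style penalty, and the safety thresholds (`safe_dist`) are all hypothetical. Agent futures and the ego plan are assumed to be lists of (x, y) waypoints sampled at the same timesteps, and map boundaries are assumed to be polylines.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def point_to_segment(p: Point, a: Point, b: Point) -> float:
    """Euclidean distance from point p to the line segment ab."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def agent_collision_penalty(
    ego_plan: Sequence[Point],
    agent_trajs: Sequence[Sequence[Point]],
    safe_dist: float = 3.0,  # hypothetical threshold, in metres
) -> float:
    """Hinge penalty whenever the ego waypoint and an agent waypoint
    at the same timestep are closer than safe_dist."""
    penalty = 0.0
    for traj in agent_trajs:
        for ego_pt, agent_pt in zip(ego_plan, traj):
            penalty += max(0.0, safe_dist - math.dist(ego_pt, agent_pt))
    return penalty


def boundary_penalty(
    ego_plan: Sequence[Point],
    boundaries: Sequence[Sequence[Point]],
    safe_dist: float = 1.0,  # hypothetical threshold, in metres
) -> float:
    """Hinge penalty whenever an ego waypoint is closer than safe_dist
    to the nearest segment of any boundary polyline."""
    penalty = 0.0
    for ego_pt in ego_plan:
        nearest = min(
            (
                point_to_segment(ego_pt, a, b)
                for polyline in boundaries
                for a, b in zip(polyline, polyline[1:])
            ),
            default=math.inf,  # no boundaries: nothing to penalize
        )
        penalty += max(0.0, safe_dist - nearest)
    return penalty


# Toy scene: the ego drives straight ahead, one agent cuts across its path,
# and a single boundary polyline runs parallel to the ego 2 m to the side.
ego_plan: List[Point] = [(0.0, 0.0), (0.0, 2.0), (0.0, 4.0)]
crossing_agent: List[Point] = [(4.0, 4.0), (2.0, 4.0), (0.0, 4.0)]
boundary: List[Point] = [(2.0, -1.0), (2.0, 5.0)]

agent_cost = agent_collision_penalty(ego_plan, [crossing_agent])
boundary_cost = boundary_penalty(ego_plan, [boundary])
```

Because each agent and each boundary is a separate instance rather than a set of occupied cells in a raster grid, a penalty like this can be attributed to the specific agent or map element that causes it, and it needs no dense grid to be rendered or searched.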