Autonomous driving requires a comprehensive understanding of the surrounding environment for reliable trajectory planning. Previous works rely on dense rasterized scene representations (e.g., agent occupancy and semantic maps) to perform planning, which are computationally intensive and discard instance-level structural information. In this paper, we propose VAD, an end-to-end vectorized paradigm for autonomous driving that models the driving scene as a fully vectorized representation. The proposed paradigm has two significant advantages. First, VAD exploits vectorized agent motion and map elements as explicit instance-level planning constraints, which improves planning safety. Second, VAD runs much faster than previous end-to-end planning methods because it dispenses with computation-intensive rasterized representations and hand-designed post-processing steps. VAD achieves state-of-the-art end-to-end planning performance on the nuScenes dataset, outperforming the previous best method by a large margin. Relative to that method, our base model, VAD-Base, reduces the average collision rate by 29.0% and runs 2.5x faster. In addition, a lightweight variant, VAD-Tiny, improves inference speed by up to 9.3x while achieving comparable planning performance. We believe the strong performance and high efficiency of VAD are critical for the real-world deployment of autonomous driving systems. Code and models will be released to facilitate future research.
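To make the idea of instance-level planning constraints concrete, the following is a minimal Python sketch, not the VAD implementation. The abstract does not state how the constraints are formulated, so everything here is an assumption made for illustration: the names `MapElement`, `AgentMotion`, `collision_penalty`, and `boundary_penalty`, the hinge-style penalty, and the safety distances are all hypothetical. The sketch only shows why a vectorized scene (polylines and waypoint sequences) lets a planner reason about each agent and each map element directly, without rasterizing the scene into a dense grid.

```python
from dataclasses import dataclass
from math import hypot

Point = tuple[float, float]  # (x, y) in metres, ego-centred frame (assumed)


@dataclass
class MapElement:
    """One map instance (hypothetical type), stored as an ordered polyline."""
    kind: str               # e.g. "boundary" or "lane_divider"
    polyline: list[Point]   # needs at least two points


@dataclass
class AgentMotion:
    """One agent instance (hypothetical type) with its predicted future waypoints."""
    agent_id: int
    future: list[Point]     # one waypoint per future timestep


def point_to_segment(p: Point, a: Point, b: Point) -> float:
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))


def point_to_polyline(p: Point, polyline: list[Point]) -> float:
    """Distance from p to the nearest segment of the polyline."""
    return min(point_to_segment(p, a, b) for a, b in zip(polyline, polyline[1:]))


def collision_penalty(ego_plan: list[Point], agent: AgentMotion, safe_dist: float) -> float:
    """Hinge penalty: at each timestep, penalise the ego waypoint for being
    closer than safe_dist to the agent's waypoint at the same timestep."""
    return sum(
        max(0.0, safe_dist - hypot(ex - ax, ey - ay))
        for (ex, ey), (ax, ay) in zip(ego_plan, agent.future)
    )


def boundary_penalty(ego_plan: list[Point], element: MapElement, safe_dist: float) -> float:
    """Hinge penalty: penalise ego waypoints closer than safe_dist to a map polyline."""
    return sum(
        max(0.0, safe_dist - point_to_polyline(p, element.polyline))
        for p in ego_plan
    )


# Toy scene: the ego drives straight ahead while another agent cuts in from the right.
ego_plan = [(0.0, 0.0), (0.0, 2.0), (0.0, 4.0)]
agent = AgentMotion(agent_id=1, future=[(3.0, 0.0), (2.0, 2.0), (1.0, 4.0)])
boundary = MapElement(kind="boundary", polyline=[(-1.5, -10.0), (-1.5, 10.0)])

print(collision_penalty(ego_plan, agent, safe_dist=1.5))    # 0.5 (only the last step is too close)
print(boundary_penalty(ego_plan, boundary, safe_dist=1.0))  # 0.0 (ego stays 1.5 m from the boundary)
```

In this toy scene the whole environment is three short lists of points, and each penalty is attributable to one specific agent or map element. A rasterized representation would instead need a dense grid per timestep and would lose that per-instance attribution.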