End-to-end paradigms use a unified framework to handle multiple tasks in an autonomous driving system. Despite their simplicity and clarity, end-to-end autonomous driving methods still perform far worse on sub-tasks than single-task methods. Moreover, the dense BEV features widely used in previous end-to-end methods make it costly to extend them to more modalities or tasks. In this paper, we propose a Sparse query-centric paradigm for end-to-end Autonomous Driving (SparseAD), in which sparse queries represent the whole driving scenario across space, time, and tasks without any dense BEV representation. Concretely, we design a unified sparse architecture for perception tasks, including detection, tracking, and online mapping. We also revisit motion prediction and planning, and devise a more justifiable motion planner framework. On the challenging nuScenes dataset, SparseAD achieves state-of-the-art (SOTA) full-task performance among end-to-end methods and significantly narrows the performance gap between end-to-end paradigms and single-task methods. Code will be released soon.