Modern autonomous driving systems are characterized by modular tasks in sequential order, i.e., perception, prediction, and planning. To perform a wide range of tasks and achieve advanced intelligence, contemporary approaches either deploy standalone models for individual tasks or adopt a multi-task paradigm with separate heads. However, they may suffer from accumulated errors or deficient task coordination. Instead, we argue that a favorable framework should be devised and optimized in pursuit of the ultimate goal, i.e., planning for the self-driving car. With this orientation, we revisit the key components within perception and prediction, and prioritize the tasks so that all of them contribute to planning. We introduce Unified Autonomous Driving (UniAD), a comprehensive, up-to-date framework that incorporates full-stack driving tasks in one network. It is carefully designed to leverage the advantages of each module and to provide complementary feature abstractions for agent interaction from a global perspective. Tasks communicate through unified query interfaces so that they facilitate each other toward planning. We instantiate UniAD on the challenging nuScenes benchmark. Extensive ablations demonstrate the effectiveness of this philosophy: UniAD substantially outperforms previous state-of-the-art methods in all aspects. Code and models are public.