End-to-end autonomous driving unifies tasks in a differentiable framework, which enables planning-oriented optimization and has attracted growing attention. Current methods aggregate historical information either through dense historical bird's-eye-view (BEV) features or by querying a sparse memory bank, following paradigms inherited from detection. However, we argue that these paradigms either omit historical information in motion planning or fail to align with its multi-step nature, which requires predicting or planning multiple future time steps. In line with the philosophy that the future is a continuation of the past, we propose BridgeAD, which reformulates motion and planning queries as multi-step queries so that each future time step has its own query. This design enables effective use of historical predictions and plans by applying them to the appropriate parts of the end-to-end system according to their time steps, which improves both perception and motion planning. Specifically, historical queries for the current frame are combined with perception, while historical queries for future frames are integrated with motion planning. In this way, we bridge the gap between past and future by aggregating historical insights at every time step, enhancing the overall coherence and accuracy of the end-to-end autonomous driving pipeline. Extensive experiments on the nuScenes dataset in both open-loop and closed-loop settings demonstrate that BridgeAD achieves state-of-the-art performance.
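The routing idea described above can be illustrated with a minimal sketch. It is not the BridgeAD implementation: the `StepQuery` container, the `route_history` function, and the tuple used as a stand-in for a learned query embedding are all hypothetical names introduced here. The sketch assumes each stored query records the frame at which it was produced and how many steps ahead it looks, so that the frame it refers to can be compared with the current frame.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class StepQuery:
    """One per-time-step query (hypothetical container, not from the paper)."""
    made_at: int                # frame index at which the query was produced
    offset: int                 # steps ahead it refers to (1 = next frame)
    feature: Tuple[float, ...]  # stand-in for a learned embedding

    @property
    def target_frame(self) -> int:
        # Absolute frame index this query makes a statement about.
        return self.made_at + self.offset


def route_history(
    history: List[StepQuery], current_frame: int
) -> Tuple[List[StepQuery], Dict[int, List[StepQuery]]]:
    """Split stored queries by the frame they refer to.

    Queries that target the current frame are returned for perception.
    Queries that target later frames are grouped by how far ahead they
    are, for motion planning. Queries about past frames are dropped
    (an assumption of this sketch).
    """
    for_perception: List[StepQuery] = []
    for_planning: Dict[int, List[StepQuery]] = {}
    for query in history:
        delta = query.target_frame - current_frame
        if delta == 0:
            for_perception.append(query)
        elif delta > 0:
            for_planning.setdefault(delta, []).append(query)
    return for_perception, for_planning
```

For example, with queries produced at frames 8 and 9, each looking 1 to 3 steps ahead, and a current frame of 10, the two queries that target frame 10 would go to perception, while the queries that target frames 11 and 12 would be grouped under future steps 1 and 2 for motion planning. How the routed queries are then fused with perception or planning features is not specified in the abstract and is left out of the sketch.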