EgoFSD: Ego-Centric Fully Sparse Paradigm with Uncertainty Denoising and Iterative Refinement for Efficient End-to-End Self-Driving

Current End-to-End Autonomous Driving (E2E-AD) methods unify modular designs for multiple tasks (e.g., perception, prediction, and planning). Although they are optimized in a planning-oriented manner within a fully differentiable framework, existing end-to-end driving systems lack ego-centric designs and still suffer from unsatisfactory performance and poor efficiency, owing to rasterized scene representation learning and redundant information transmission. In this paper, we propose EgoFSD, an ego-centric fully sparse paradigm for end-to-end self-driving. Specifically, EgoFSD consists of a sparse perception module, a hierarchical interaction module, and an iterative motion planner. The sparse perception module performs detection and online mapping based on a sparse representation of the driving scene. The hierarchical interaction module selects the Closest In-Path Vehicle / Stationary (CIPV / CIPS) from coarse to fine, benefiting from an additional geometric prior. The iterative motion planner considers both the selected interactive agents and the ego-vehicle for joint motion prediction, and the output multi-modal ego-trajectories are optimized iteratively. In addition, position-level motion diffusion and trajectory-level planning denoising are introduced for uncertainty modeling, which improves training stability and convergence speed. Extensive experiments on the nuScenes and Bench2Drive datasets show that EgoFSD reduces the average L2 error by 59% and the collision rate by 92% relative to UniAD, while running 6.9x faster.
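
To make the coarse-to-fine CIPV / CIPS selection more concrete, the following is a minimal, purely geometric sketch. It is not the paper's hierarchical interaction module: the abstract does not describe how that module is built, only that it uses a geometric prior. Every name and threshold here (`Agent`, `select_closest_in_path`, `coarse_radius`, `half_width`, `static_speed`) is a hypothetical choice for illustration, and the ego path is assumed to be a straight corridor along the ego vehicle's forward axis.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Agent:
    """Hypothetical agent state expressed in the ego-vehicle frame."""
    agent_id: int
    x: float      # longitudinal offset in metres, forward is positive
    y: float      # lateral offset in metres, left is positive
    speed: float  # speed magnitude in m/s


def select_closest_in_path(
    agents: List[Agent],
    coarse_radius: float = 50.0,  # assumed coarse range gate (m)
    half_width: float = 1.8,      # assumed half-width of the ego corridor (m)
    static_speed: float = 0.5,    # assumed speed below which an agent is stationary
) -> Tuple[Optional[Agent], Optional[Agent]]:
    """Return (cipv, cips); either entry is None if no candidate exists."""
    # Coarse stage: keep agents ahead of the ego vehicle and within range.
    coarse = [
        a for a in agents
        if a.x > 0.0 and math.hypot(a.x, a.y) <= coarse_radius
    ]
    # Fine stage: keep only agents inside the straight ego corridor.
    in_path = [a for a in coarse if abs(a.y) <= half_width]

    moving = [a for a in in_path if a.speed > static_speed]
    stationary = [a for a in in_path if a.speed <= static_speed]

    # The closest candidate is the one with the smallest longitudinal offset.
    cipv = min(moving, key=lambda a: a.x, default=None)
    cips = min(stationary, key=lambda a: a.x, default=None)
    return cipv, cips
```

In this sketch, only the two returned agents would be passed on to a downstream planner, which illustrates the general idea of avoiding redundant information transmission; a learned system would typically replace the hard thresholds with scored queries.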
