EgoFSD: Ego-Centric Fully Sparse Paradigm with Uncertainty Denoising and Iterative Refinement for Efficient End-to-End Self-Driving

Current End-to-End Autonomous Driving (E2E-AD) methods unify modular designs for multiple tasks (e.g., perception, prediction, and planning). Although they are optimized in a planning-oriented manner within a fully differentiable framework, existing end-to-end driving systems lack ego-centric designs and still suffer from unsatisfactory performance and poor efficiency, owing to rasterized scene representation learning and redundant information transmission. In this paper, we propose an ego-centric fully sparse paradigm, named EgoFSD, for end-to-end self-driving. Specifically, EgoFSD consists of sparse perception, hierarchical interaction, and an iterative motion planner. The sparse perception module performs detection and online mapping based on a sparse representation of the driving scene. The hierarchical interaction module selects the Closest In-Path Vehicle / Stationary (CIPV / CIPS) in a coarse-to-fine manner, benefiting from an additional geometric prior. The iterative motion planner considers both the selected interactive agents and the ego vehicle for joint motion prediction, and the output multi-modal ego trajectories are optimized iteratively. In addition, position-level motion diffusion and trajectory-level planning denoising are introduced for uncertainty modeling, which improves training stability and convergence speed. Extensive experiments on the nuScenes and Bench2Drive datasets show that EgoFSD reduces the average L2 error by 59% and the collision rate by 92% relative to UniAD, while running 6.9x faster.
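To make the CIPV / CIPS notion concrete, the following is a minimal, hand-written geometric sketch of a coarse-to-fine selection. It is not the learned hierarchical interaction module described in the abstract. The `Agent` type, the function `select_closest_in_path`, the straight-corridor assumption, and all thresholds (`max_range`, `half_width`, `stationary_speed`) are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Agent:
    """Hypothetical agent state expressed in the ego-vehicle frame."""
    agent_id: int
    x: float      # longitudinal offset in metres, positive ahead of the ego vehicle
    y: float      # lateral offset in metres, positive to the left
    speed: float  # speed in m/s


def select_closest_in_path(
    agents: Sequence[Agent],
    max_range: float = 60.0,        # assumed coarse range limit (m)
    half_width: float = 1.8,        # assumed half-width of a straight ego corridor (m)
    stationary_speed: float = 0.5,  # assumed speed threshold for "stationary" (m/s)
) -> Tuple[Optional[Agent], Optional[Agent]]:
    """Return (closest moving in-path agent, closest stationary in-path agent)."""
    # Coarse stage: keep agents ahead of the ego vehicle and within range.
    coarse = [a for a in agents if 0.0 < a.x <= max_range]

    # Fine stage: keep agents whose lateral offset lies inside the corridor.
    in_path = [a for a in coarse if abs(a.y) <= half_width]

    moving = [a for a in in_path if a.speed > stationary_speed]
    stationary = [a for a in in_path if a.speed <= stationary_speed]

    # The closest agent is the one with the smallest longitudinal offset.
    cipv = min(moving, key=lambda a: a.x, default=None)
    cips = min(stationary, key=lambda a: a.x, default=None)
    return cipv, cips


if __name__ == "__main__":
    scene = [
        Agent(1, x=30.0, y=0.5, speed=8.0),
        Agent(2, x=12.0, y=4.0, speed=10.0),  # adjacent lane, outside the corridor
        Agent(3, x=45.0, y=-0.3, speed=0.0),  # stopped object in the path
        Agent(4, x=-5.0, y=0.0, speed=6.0),   # behind the ego vehicle
        Agent(5, x=20.0, y=1.0, speed=7.0),
    ]
    cipv, cips = select_closest_in_path(scene)
    print(cipv, cips)
```

A straight corridor is the simplest possible geometric prior. A planned or predicted ego path could replace it by measuring lateral offset against that path instead of against the x-axis.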
