We propose a novel end-to-end pipeline for online long-range vectorized high-definition (HD) map construction using on-board camera sensors. The vectorized representation of HD maps, which uses polylines and polygons to represent map elements, is widely used by downstream tasks. However, previous schemes, designed with reference to dynamic object detection, overlook the structural constraints within linear map elements, which degrades performance in long-range scenarios. In this paper, we exploit the properties of map elements to improve map construction. We extract more accurate bird's-eye-view (BEV) features guided by the linear structure of map elements. We then propose a hierarchical sparse map representation that further leverages the scalability of vectorized map elements, and we design a progressive decoding mechanism and a supervision strategy based on this representation. Our approach, ScalableMap, demonstrates superior performance on the nuScenes dataset, especially in long-range scenarios, surpassing the previous state-of-the-art model by 6.5 mAP while achieving 18.3 FPS. Code is available at https://github.com/jingy1yu/ScalableMap.
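To make the idea of a hierarchical sparse representation of a vectorized map element concrete, the following is a minimal sketch and not the ScalableMap implementation. It assumes a map element is a 2D polyline and that each level of the hierarchy is obtained by resampling the polyline at evenly spaced arc-length positions with an increasing number of vertices. The function names `resample_polyline` and `build_hierarchy`, and the per-level vertex counts, are hypothetical choices made for illustration only.

```python
import math


def resample_polyline(points, n):
    """Return n points spaced evenly by arc length along a 2D polyline."""
    if n < 2 or len(points) < 2:
        raise ValueError("need at least two input points and n >= 2")
    # Cumulative arc length at every input vertex.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]

    out = []
    seg = 0  # index of the segment that contains the current target
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment whose end is at or beyond the target length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        t = 0.0 if seg_len == 0 else (target - cum[seg]) / seg_len
        x0, y0 = points[seg]
        x1, y1 = points[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def build_hierarchy(points, levels=(2, 3, 5, 9)):
    """Coarse-to-fine list of resampled polylines (assumed vertex counts)."""
    return [resample_polyline(points, n) for n in levels]


if __name__ == "__main__":
    # An L-shaped lane boundary, in metres (illustrative data).
    boundary = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0)]
    for level in build_hierarchy(boundary, levels=(2, 3, 5)):
        print(len(level), level)
```

With vertex counts of the form 2^k + 1, every coarse level is a subset of the next finer one, so a decoder could in principle refine a sparse estimate by inserting vertices between existing ones. Whether ScalableMap uses this exact sampling scheme is not stated in the abstract.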