Autonomous driving systems require a good understanding of their surrounding environment, including moving obstacles and static High-Definition (HD) semantic map elements. Existing methods approach the semantic map problem through offline manual annotation, which suffers from serious scalability issues. Recent learning-based methods produce dense rasterized segmentation predictions to construct maps. However, these predictions do not include instance information for individual map elements and require heuristic post-processing to obtain vectorized maps. To tackle these challenges, we introduce an end-to-end vectorized HD map learning pipeline, termed VectorMapNet. VectorMapNet takes onboard sensor observations and predicts a sparse set of polylines in the bird's-eye view. This pipeline can explicitly model the spatial relations between map elements and generate vectorized maps that are friendly to downstream autonomous driving tasks. Extensive experiments show that VectorMapNet achieves strong map learning performance on both the nuScenes and Argoverse2 datasets, surpassing previous state-of-the-art methods by 14.2 mAP and 14.6 mAP, respectively. Qualitatively, we also show that VectorMapNet is capable of generating comprehensive maps and capturing more fine-grained details of road geometry. To the best of our knowledge, VectorMapNet is the first work designed towards end-to-end vectorized map learning from onboard observations. Our project website is available at https://tsinghua-mars-lab.github.io/vectormapnet/.
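To make the contrast between vectorized and rasterized maps concrete, the following is a minimal sketch of the two representations. It is not the VectorMapNet implementation: the `Polyline` class, the `rasterize` helper, the label names, and the grid parameters are all hypothetical and chosen only for illustration. The sketch shows that a vectorized map keeps each element as a separate polyline, whereas a rasterized grid only records which cells are occupied.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in metres, bird's-eye-view frame (assumed)


@dataclass
class Polyline:
    """One map element: a class label plus an ordered list of vertices."""
    label: str           # hypothetical class name, e.g. "lane_divider"
    points: List[Point]  # ordered vertices of the polyline


def rasterize(polylines: List[Polyline], size: int, resolution: float) -> List[List[int]]:
    """Mark every grid cell a polyline passes through.

    The output is a size x size occupancy grid; the identity of the
    individual polylines is not recoverable from it.
    """
    grid = [[0] * size for _ in range(size)]
    for line in polylines:
        # Walk each segment and sample it at roughly half-cell spacing.
        for (x0, y0), (x1, y1) in zip(line.points, line.points[1:]):
            steps = max(1, int(max(abs(x1 - x0), abs(y1 - y0)) / resolution) * 2)
            for i in range(steps + 1):
                t = i / steps
                col = int((x0 + t * (x1 - x0)) / resolution)
                row = int((y0 + t * (y1 - y0)) / resolution)
                if 0 <= row < size and 0 <= col < size:
                    grid[row][col] = 1
    return grid


# Vectorized map: a sparse set of polylines, one entry per map element.
vector_map = [
    Polyline("lane_divider", [(0.0, 1.0), (4.0, 1.0)]),
    Polyline("lane_divider", [(0.0, 3.0), (4.0, 3.0)]),
]

# Rasterized map: dense grid of occupied cells, no per-instance information.
raster_map = rasterize(vector_map, size=5, resolution=1.0)
```

In this toy setting, `vector_map` holds two distinct elements with explicit geometry, while `raster_map` holds only occupied cells. Recovering the two dividers from the grid would require the kind of heuristic post-processing the abstract refers to.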