Recent progress in online static map element (a.k.a. HD map) construction algorithms has created substantial demand for data with ground-truth annotations. However, currently available public datasets cannot provide high-quality training data in terms of consistency and accuracy. To this end, we present CAMA: a vision-centric approach for Consistent and Accurate Map Annotation. Even without LiDAR inputs, the proposed framework generates high-quality 3D annotations of static map elements. Specifically, the annotations achieve high reprojection accuracy across all surrounding cameras and are spatio-temporally consistent across the whole sequence. We apply the framework to the popular nuScenes dataset to provide efficient and highly accurate annotations. Compared with the original nuScenes static map elements, models trained with annotations from CAMA achieve lower reprojection errors (e.g., 4.73 vs. 8.03 pixels).
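To make the reprojection error figures concrete, the following is a minimal sketch of how such a metric is commonly computed: 3D annotation points are projected into a camera image and compared with the corresponding 2D observations in pixels. The abstract does not specify CAMA's exact metric, so the ideal pinhole model (no lens distortion), the function names `project_point` and `mean_reprojection_error`, and the intrinsics used below are all assumptions made for illustration.

```python
import math


def project_point(point_cam, fx, fy, cx, cy):
    """Project a 3D point given in camera coordinates to pixel coordinates.

    Assumes an ideal pinhole camera without lens distortion (illustrative only).
    Returns None when the point lies at or behind the image plane.
    """
    x, y, z = point_cam
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def mean_reprojection_error(points_cam, observed_px, fx, fy, cx, cy):
    """Mean Euclidean pixel distance between projected points and observations.

    Points that cannot be projected (behind the camera) are skipped.
    """
    errors = []
    for point, observed in zip(points_cam, observed_px):
        projected = project_point(point, fx, fy, cx, cy)
        if projected is None:
            continue
        errors.append(math.hypot(projected[0] - observed[0],
                                 projected[1] - observed[1]))
    return sum(errors) / len(errors) if errors else float("nan")


# Hypothetical intrinsics and two annotated points, purely for demonstration.
points = [(0.0, 0.0, 10.0), (1.0, 0.0, 10.0)]
observations = [(803.0, 454.0), (900.0, 450.0)]
error = mean_reprojection_error(points, observations,
                                fx=1000.0, fy=1000.0, cx=800.0, cy=450.0)
print(f"mean reprojection error: {error:.2f} px")
```

In a multi-camera setting such as nuScenes, one would presumably first transform each annotation from the world frame into every camera frame using the calibrated extrinsics, then average this error over all cameras and frames.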