Object-based maps are relevant for scene understanding since they integrate geometric and semantic information about the environment, allowing autonomous robots to robustly localize and interact with objects. In this paper, we address the task of constructing a metric-semantic map for the purpose of long-term object-based localization. We exploit 3D object detections from monocular RGB frames both for constructing the object-based map and for globally localizing in the constructed map. To tailor the approach to a target environment, we propose an efficient way of generating 3D annotations to fine-tune the 3D object detection model. We evaluate our map construction in an office building and test our long-term localization approach on challenging sequences recorded in the same environment over nine months. The experiments suggest that our approach is suitable for constructing metric-semantic maps and that our localization approach is robust to long-term changes. Both the mapping algorithm and the localization pipeline can run online on an onboard computer. We release an open-source C++/ROS implementation of our approach.