Object-based maps are relevant for scene understanding since they integrate geometric and semantic information of the environment, allowing autonomous robots to robustly localize and interact with objects. In this paper, we address the task of constructing a metric-semantic map for the purpose of long-term object-based localization. We exploit 3D object detections from monocular RGB frames both for constructing the object-based map and for globally localizing in the constructed map. To tailor the approach to a target environment, we propose an efficient way of generating 3D annotations to fine-tune the 3D object detection model. We evaluate our map construction in an office building and test our long-term localization approach on challenging sequences recorded in the same environment over nine months. The experiments suggest that our approach is suitable for constructing metric-semantic maps and that our localization approach is robust to long-term changes. Both the mapping algorithm and the localization pipeline can run online on an onboard computer. We will release an open-source C++/ROS implementation of our approach.