Accurate perception of objects in the environment is important for improving the scene understanding capability of SLAM systems. In robotic and augmented reality applications, object maps that carry both semantic and metric information offer clear advantages. In this paper, we present RO-MAP, a novel multi-object mapping pipeline that does not rely on 3D priors. Given only monocular input, we use neural radiance fields to represent objects and couple them with a lightweight object SLAM based on multi-view geometry, to simultaneously localize objects and implicitly learn their dense geometry. We create a separate implicit model for each detected object and train these models dynamically and in parallel as new observations are added. Experiments on synthetic and real-world datasets demonstrate that our method can generate semantic object maps with shape reconstruction, and that it is competitive with offline methods while achieving real-time performance (25 Hz). The code and dataset will be available at: https://github.com/XiaoHan-Git/RO-MAP
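The per-object training scheme described in the abstract (one implicit model per detected object, updated in parallel as observations arrive) can be illustrated with a minimal bookkeeping sketch. This is not the RO-MAP implementation: `ObjectModel`, `ObjectMap`, `add_observation`, and `train_all` are hypothetical names, and the "model" is a toy running mean standing in for a neural radiance field. Only the structure is meant to carry over: models are created lazily on first detection, and each is trained independently of the others.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class ObjectModel:
    """Hypothetical stand-in for one object's implicit model (not a NeRF)."""
    object_id: int
    pending: list = field(default_factory=list)  # observations not yet trained on
    train_steps: int = 0
    estimate: float = 0.0  # toy "learned" quantity: running mean of samples

    def train_step(self):
        # Consume every pending observation and update the toy estimate.
        for sample in self.pending:
            self.train_steps += 1
            self.estimate += (sample - self.estimate) / self.train_steps
        self.pending.clear()
        return self.train_steps


class ObjectMap:
    """Keeps one independent model per detected object."""

    def __init__(self):
        self.models = {}

    def add_observation(self, object_id, sample):
        # Create the model lazily the first time an object is detected.
        if object_id not in self.models:
            self.models[object_id] = ObjectModel(object_id)
        self.models[object_id].pending.append(sample)

    def train_all(self, max_workers=4):
        # Each model is touched by exactly one worker, so no locking is needed.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            list(pool.map(lambda m: m.train_step(), self.models.values()))


# Two frames: object 1 is seen in both, object 2 appears only in the second.
object_map = ObjectMap()
for frame in [{1: 2.0}, {1: 4.0, 2: 10.0}]:
    for object_id, sample in frame.items():
        object_map.add_observation(object_id, sample)
    object_map.train_all()

for object_id, model in sorted(object_map.models.items()):
    print(object_id, model.train_steps, model.estimate)
```

In a real system the thread pool would presumably be replaced by batched GPU training, and each observation would be a set of rays and pixel colors rather than a scalar; those details are outside what the abstract specifies.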