Accurate perception of objects in the environment is important for improving the scene understanding capability of SLAM systems. In robotic and augmented reality applications, object maps that carry both semantic and metric information offer clear advantages. In this paper, we present RO-MAP, a novel multi-object mapping pipeline that does not rely on 3D priors. Given only monocular input, we represent objects with neural radiance fields and couple them with a lightweight object SLAM system based on multi-view geometry, simultaneously localizing objects and implicitly learning their dense geometry. We create a separate implicit model for each detected object and train these models dynamically and in parallel as new observations are added. Experiments on synthetic and real-world datasets demonstrate that our method can generate semantic object maps with shape reconstruction and is competitive with offline methods while achieving real-time performance (25 Hz). The code and dataset will be available at: https://github.com/XiaoHan-Git/RO-MAP
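
The per-object bookkeeping described above (one implicit model per detected object, created on first detection and trained in parallel as observations arrive) can be illustrated with a minimal sketch. This is not the RO-MAP implementation: the names `ObjectModel`, `ObjectMap`, and `integrate` are hypothetical, the "training" is a placeholder that only counts steps, and a thread pool stands in for whatever parallel scheme the actual system uses.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class ObjectModel:
    """Hypothetical stand-in for one object's implicit (NeRF-style) model."""
    object_id: int
    observations: list = field(default_factory=list)
    train_steps: int = 0

    def add_observation(self, frame_id: int) -> None:
        # A real system would store pixels, masks, and camera poses here.
        self.observations.append(frame_id)

    def train(self, steps: int) -> int:
        # Placeholder for real optimisation: only counts the steps taken.
        self.train_steps += steps
        return self.train_steps


class ObjectMap:
    """Keeps one model per object ID and trains the updated ones in parallel."""

    def __init__(self, workers: int = 4) -> None:
        self.models: dict[int, ObjectModel] = {}
        self.pool = ThreadPoolExecutor(max_workers=workers)

    def integrate(self, frame_id: int, detected_ids: list[int], steps: int = 10) -> None:
        futures = []
        # De-duplicate IDs so each model is trained by at most one worker.
        for oid in set(detected_ids):
            # Create the model lazily the first time the object is seen.
            model = self.models.setdefault(oid, ObjectModel(oid))
            model.add_observation(frame_id)
            futures.append(self.pool.submit(model.train, steps))
        # Wait for this frame's training jobs before accepting the next frame.
        for future in futures:
            future.result()

    def close(self) -> None:
        self.pool.shutdown()


if __name__ == "__main__":
    object_map = ObjectMap(workers=2)
    object_map.integrate(frame_id=0, detected_ids=[1, 2])
    object_map.integrate(frame_id=1, detected_ids=[2, 3])
    print(sorted(object_map.models))
    object_map.close()
```

Because the models are independent of one another, adding a newly detected object does not disturb the training state of objects already in the map.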