We propose a novel visual SLAM method that tightly integrates text objects by treating them as semantic features and fully exploiting their geometric and semantic priors. Each text object is modeled as a texture-rich planar patch whose semantic meaning is extracted and updated on the fly for better data association. By fully exploiting the locally planar characteristics and semantic meaning of text objects, the SLAM system becomes more accurate and robust even under challenging conditions such as image blur, large viewpoint changes, and significant illumination variations (day and night). We tested our method in various scenes with ground-truth data. The results show that integrating text features leads to a superior SLAM system that can match images across day and night. The reconstructed semantic 3D text map could be useful for navigation and scene understanding in robotic and mixed reality applications. Our project page: https://github.com/SJTU-ViSYS/TextSLAM .