Geometry foundation models have significantly advanced dense geometric SLAM, yet existing systems often lack deep semantic understanding and robust loop closure. Meanwhile, current semantic mapping approaches are frequently hindered by decoupled architectures and fragile data association. We propose IRIS-SLAM, a novel RGB semantic SLAM system built on unified geometric-instance representations derived from an instance-extended foundation model. By extending a geometry foundation model to jointly predict dense geometry and cross-view consistent instance embeddings, we enable a semantic-synergized association mechanism and instance-guided loop closure detection. Our approach uses viewpoint-agnostic semantic anchors to bridge the gap between geometric reconstruction and open-vocabulary mapping. Experimental results demonstrate that IRIS-SLAM significantly outperforms state-of-the-art methods, particularly in map consistency and wide-baseline loop closure reliability.