Semantic understanding plays a crucial role in dense Simultaneous Localization and Mapping (SLAM), facilitating comprehensive scene interpretation. Recent work that integrates Gaussian Splatting into SLAM systems has demonstrated its effectiveness in generating high-quality renderings from explicit 3D Gaussian representations. Building on this progress, we propose SGS-SLAM, the first semantic dense visual SLAM system grounded in 3D Gaussians, which provides precise 3D semantic segmentation alongside high-fidelity reconstructions. Specifically, we employ multi-channel optimization during the mapping process, integrating appearance, geometric, and semantic constraints with key-frame optimization to enhance reconstruction quality. Extensive experiments demonstrate that SGS-SLAM delivers state-of-the-art performance in camera pose estimation, map reconstruction, and semantic segmentation, outperforming existing methods while preserving real-time rendering ability.
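To make the idea of multi-channel optimization concrete, the following is a minimal sketch of how appearance, geometric, and semantic constraints might be combined into a single mapping objective. It is an illustration under stated assumptions, not the SGS-SLAM implementation: the L1 error, the weighted-sum form, the channel keys (`color`, `depth`, `semantic`), and the names `ChannelWeights`, `l1`, and `multi_channel_loss` are all hypothetical and do not come from the abstract.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class ChannelWeights:
    # Hypothetical per-channel weights; the abstract gives no values.
    appearance: float = 1.0
    geometry: float = 1.0
    semantic: float = 1.0


def l1(rendered: Sequence[float], observed: Sequence[float]) -> float:
    """Mean absolute error between two equal-length flattened images."""
    if len(rendered) != len(observed):
        raise ValueError("rendered and observed must have the same length")
    if len(rendered) == 0:
        raise ValueError("inputs must be non-empty")
    return sum(abs(r - o) for r, o in zip(rendered, observed)) / len(rendered)


def multi_channel_loss(
    rendered: Dict[str, Sequence[float]],
    observed: Dict[str, Sequence[float]],
    weights: ChannelWeights = ChannelWeights(),
) -> float:
    """Weighted sum of per-channel L1 errors (an assumed form of the objective).

    Both dicts are expected to hold flattened 'color', 'depth', and
    'semantic' images of matching sizes.
    """
    return (
        weights.appearance * l1(rendered["color"], observed["color"])
        + weights.geometry * l1(rendered["depth"], observed["depth"])
        + weights.semantic * l1(rendered["semantic"], observed["semantic"])
    )


# Toy two-pixel example with made-up values.
rendered = {"color": [0.5, 0.5], "depth": [1.0, 2.0], "semantic": [0.0, 1.0]}
observed = {"color": [0.5, 0.0], "depth": [1.0, 3.0], "semantic": [0.0, 1.0]}
loss = multi_channel_loss(rendered, observed)
```

In a real system the rendered channels would come from differentiably rasterizing the 3D Gaussians at a key-frame pose, and the loss would be minimized by gradient descent over the Gaussian parameters; this sketch only shows how the three constraints can share one scalar objective.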