We propose SemGauss-SLAM, a dense semantic SLAM system built on a 3D Gaussian representation that simultaneously delivers accurate 3D semantic mapping, robust camera tracking, and high-quality rendering. The system embeds semantic features in the 3D Gaussian representation, encoding semantic information within the spatial layout of the environment for precise semantic scene representation. We further propose a feature-level loss for updating the 3D Gaussian representation, which provides higher-level guidance for 3D Gaussian optimization. To reduce cumulative drift in tracking and improve semantic reconstruction accuracy, we introduce semantic-informed bundle adjustment, which leverages multi-frame semantic associations to jointly optimize the 3D Gaussian representation and camera poses, yielding low-drift tracking and accurate mapping. SemGauss-SLAM outperforms existing radiance field-based SLAM methods in mapping and tracking accuracy on the Replica and ScanNet datasets, and also achieves high-precision semantic segmentation and dense semantic mapping.
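To make the idea of a feature-level loss concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state the form of the loss, so this sketch assumes a mean absolute (L1) difference between a feature map rendered from the 3D Gaussians and a reference feature map produced by some 2D feature extractor. The names `feature_level_loss`, `rendered`, and `reference` are hypothetical, and feature maps are modeled as plain lists with one feature vector per pixel.

```python
from typing import Sequence

# Hypothetical layout: one feature vector per pixel, pixels in any fixed order.
FeatureMap = Sequence[Sequence[float]]


def feature_level_loss(rendered: FeatureMap, reference: FeatureMap) -> float:
    """Mean absolute difference between two per-pixel feature maps.

    Assumption: an L1 distance is used here only for illustration; the
    abstract does not specify which distance the feature-level loss uses.
    """
    if len(rendered) != len(reference):
        raise ValueError("feature maps must cover the same number of pixels")

    total = 0.0
    count = 0
    for rendered_vec, reference_vec in zip(rendered, reference):
        if len(rendered_vec) != len(reference_vec):
            raise ValueError("feature vectors must have the same dimension")
        for r, t in zip(rendered_vec, reference_vec):
            total += abs(r - t)
            count += 1

    if count == 0:
        raise ValueError("feature maps must not be empty")
    return total / count


# Two pixels with two-dimensional features each.
rendered = [[0.0, 1.0], [2.0, 3.0]]
reference = [[0.0, 0.0], [2.0, 5.0]]
loss = feature_level_loss(rendered, reference)
```

In a full system, a term like this would presumably be combined with photometric and depth terms during Gaussian optimization; the weighting between those terms is not given in the abstract and is left out of the sketch.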