Visual simultaneous localization and mapping (V-SLAM) is a fundamental capability for autonomous perception and navigation. However, endoscopic scenes violate the rigidity assumption due to persistent soft-tissue deformations, creating a strong coupling ambiguity between camera ego-motion and intrinsic deformation. Although recent monocular non-rigid SLAM methods have made notable progress, they often lack effective decoupling mechanisms and rely on sparse or low-fidelity scene representations, leading to tracking drift and limited reconstruction quality. To address these limitations, we propose NRGS-SLAM, a monocular non-rigid SLAM system for endoscopy based on 3D Gaussian Splatting. To resolve the coupling ambiguity, we introduce a deformation-aware 3D Gaussian map that augments each Gaussian primitive with a learnable deformation probability, optimized via a Bayesian self-supervision strategy without requiring external non-rigidity labels. Building on this representation, we design a deformable tracking module that performs robust coarse-to-fine pose estimation by prioritizing low-deformation regions, followed by efficient per-frame deformation updates. A carefully designed deformable mapping module progressively expands and refines the map, balancing representational capacity and computational efficiency. In addition, a unified robust geometric loss incorporates external geometric priors to mitigate the inherent ill-posedness of monocular non-rigid SLAM. Extensive experiments on multiple public endoscopic datasets demonstrate that NRGS-SLAM achieves more accurate camera pose estimation (up to a 50\% reduction in RMSE) and higher-quality photo-realistic reconstructions than state-of-the-art methods. Comprehensive ablation studies further validate the effectiveness of our key design choices. The source code will be made publicly available upon paper acceptance.
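The core decoupling idea in the abstract can be illustrated with a minimal sketch: if each Gaussian primitive carries a learnable deformation probability $p \in [0, 1]$, pose estimation can down-weight residuals in high-deformation regions by $1 - p$, so camera motion is estimated mostly from near-rigid tissue. All names, shapes, and the weighting form below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def deformation_weighted_loss(residuals, deform_prob):
    """Hypothetical robust tracking loss: per-point residuals are weighted by
    (1 - p), where p is each Gaussian's learned deformation probability, so
    low-deformation (near-rigid) regions dominate the pose estimate."""
    weights = 1.0 - np.clip(deform_prob, 0.0, 1.0)
    return float(np.sum(weights * residuals**2) / (np.sum(weights) + 1e-8))

# Toy example: two near-rigid points (small p) and one strongly deforming
# point (large p) with a large photometric residual.
res = np.array([0.1, 0.2, 2.0])      # per-point photometric residuals
p = np.array([0.05, 0.10, 0.90])     # deformation probabilities
loss = deformation_weighted_loss(res, p)
```

With this weighting, the deforming point contributes little to the loss, whereas an unweighted mean of squared residuals would be dominated by it.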