NRGS-SLAM: Monocular Non-Rigid SLAM for Endoscopy via Deformation-Aware 3D Gaussian Splatting

Visual simultaneous localization and mapping (V-SLAM) is a fundamental capability for autonomous perception and navigation. However, endoscopic scenes violate the rigidity assumption due to persistent soft-tissue deformations, creating a strong coupling ambiguity between camera ego-motion and intrinsic deformation. Although recent monocular non-rigid SLAM methods have made notable progress, they often lack effective decoupling mechanisms and rely on sparse or low-fidelity scene representations, which leads to tracking drift and limited reconstruction quality. To address these limitations, we propose NRGS-SLAM, a monocular non-rigid SLAM system for endoscopy based on 3D Gaussian Splatting. To resolve the coupling ambiguity, we introduce a deformation-aware 3D Gaussian map that augments each Gaussian primitive with a learnable deformation probability, optimized via a Bayesian self-supervision strategy without requiring external non-rigidity labels. Building on this representation, we design a deformable tracking module that performs robust coarse-to-fine pose estimation by prioritizing low-deformation regions, followed by efficient per-frame deformation updates. A carefully designed deformable mapping module progressively expands and refines the map, balancing representational capacity and computational efficiency. In addition, a unified robust geometric loss incorporates external geometric priors to mitigate the inherent ill-posedness of monocular non-rigid SLAM. Extensive experiments on multiple public endoscopic datasets demonstrate that NRGS-SLAM achieves more accurate camera pose estimation (up to 50% reduction in RMSE) and higher-quality photo-realistic reconstructions than state-of-the-art methods. Comprehensive ablation studies further validate the effectiveness of our key design choices. Source code will be publicly available upon paper acceptance.
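
The abstract does not give the formulas behind the deformation probability or the tracking loss, so the following is only an illustrative sketch of the general idea, not the method used in NRGS-SLAM. It assumes a two-hypothesis Gaussian residual model (rigid vs. deforming) for a Bayesian update of a per-primitive deformation probability, and a Huber loss weighted by (1 - probability) so that low-deformation regions dominate pose estimation. All function names, the noise scales `sigma_rigid` and `sigma_deform`, and the choice of Huber loss are hypothetical.

```python
import math


def gaussian_pdf(r, sigma):
    """Zero-mean Gaussian density evaluated at residual r."""
    return math.exp(-0.5 * (r / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def update_deformation_prob(prior, residual, sigma_rigid=1.0, sigma_deform=5.0):
    """Bayes update of a deformation probability from one residual.

    Assumption (not from the paper): residuals of rigid primitives follow a
    narrow Gaussian, residuals of deforming primitives a wide one.
    """
    like_deform = gaussian_pdf(residual, sigma_deform)
    like_rigid = gaussian_pdf(residual, sigma_rigid)
    evidence = prior * like_deform + (1.0 - prior) * like_rigid
    if evidence == 0.0:
        # Both likelihoods underflowed; keep the prior unchanged.
        return prior
    return prior * like_deform / evidence


def huber(r, delta=1.0):
    """Standard Huber loss: quadratic near zero, linear in the tails."""
    a = abs(r)
    return 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)


def deformation_weighted_loss(residuals, deform_probs, delta=1.0):
    """Weighted mean Huber loss that down-weights likely-deforming primitives."""
    weights = [1.0 - p for p in deform_probs]
    total = sum(weights)
    if total == 0.0:
        return 0.0
    return sum(w * huber(r, delta) for w, r in zip(weights, residuals)) / total


if __name__ == "__main__":
    # A small residual lowers the deformation probability, a large one raises it.
    print(update_deformation_prob(0.5, 0.0))
    print(update_deformation_prob(0.5, 10.0))
    # The second point is fully "deforming", so it does not affect the loss.
    print(deformation_weighted_loss([0.5, 10.0], [0.0, 1.0]))
```

In this toy model a primitive whose residual stays small is pushed toward the rigid hypothesis and therefore contributes more to the pose loss, which is one simple way to realize the "prioritize low-deformation regions" idea described above.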
