Simultaneous Localization and Mapping (SLAM) from a monocular endoscopy video can enable autonomous navigation, guidance to unsurveyed regions, and 3D visualizations, which can significantly improve both the endoscopy experience for surgeons and patient outcomes. Existing dense SLAM algorithms often assume distant, static lighting and textured surfaces, and alternate between optimizing scene geometry and camera parameters by minimizing a photometric rendering loss, often called Photometric Bundle Adjustment. However, endoscopic environments exhibit dynamic near-field lighting, because the co-located light and camera move extremely close to the surface, as well as textureless surfaces and strong specular reflections caused by mucus layers. When not accounted for, these near-field lighting effects can significantly degrade the performance of existing SLAM algorithms designed for indoor/outdoor scenes when they are applied to endoscopy videos. To mitigate this problem, we introduce a new Near-Field Lighting Bundle Adjustment loss $(L_{NFL-BA})$ that can be optimized in alternation with the Photometric Bundle Adjustment loss, such that the intensity variations in the captured images match the relative distance and orientation between the surface and the co-located light and camera. We derive a general NFL-BA loss function for 3D Gaussian surface representations and demonstrate that adding $L_{NFL-BA}$ can significantly improve the tracking and mapping performance of two state-of-the-art 3DGS-SLAM systems, MonoGS (35% improvement in tracking, 48% improvement in mapping with predicted depth maps) and EndoGSLAM (22% improvement in tracking, marginal improvement in mapping with predicted depths), on the C3VD colonoscopy dataset. The project page is available at https://asdunnbe.github.io/NFL-BA/
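To make the intuition behind a near-field lighting term concrete, the sketch below computes the intensity predicted for a surface point lit by a point light co-located with the camera, and compares it with an observed intensity. It is only an illustration under assumptions that do not come from the abstract: a Lambertian surface, inverse-square falloff, a light at the camera origin, and no specular term, camera gain, or 3D Gaussian representation. The function names `near_field_shading` and `shading_residual` are hypothetical and are not the paper's $L_{NFL-BA}$.

```python
import math


def near_field_shading(point, normal, albedo=1.0, light_power=1.0):
    """Predicted intensity of a surface point lit from the camera origin.

    Assumed model (not taken from the paper):
        I = albedo * light_power * max(0, cos(theta)) / d**2
    where d is the distance from the light to the point and theta is the
    angle between the surface normal and the direction toward the light.
    """
    d = math.sqrt(sum(c * c for c in point))
    if d == 0.0:
        raise ValueError("surface point coincides with the light/camera center")
    n_len = math.sqrt(sum(c * c for c in normal))
    if n_len == 0.0:
        raise ValueError("normal must be non-zero")
    # Unit direction from the surface point toward the light at the origin.
    to_light = [-c / d for c in point]
    cos_theta = sum(l * n / n_len for l, n in zip(to_light, normal))
    # Surfaces facing away from the light receive no direct illumination.
    return albedo * light_power * max(0.0, cos_theta) / (d * d)


def shading_residual(observed, point, normal, albedo=1.0, light_power=1.0):
    """Absolute difference between observed and predicted intensity."""
    predicted = near_field_shading(point, normal, albedo, light_power)
    return abs(observed - predicted)


if __name__ == "__main__":
    # A patch facing the camera head-on, at two different depths.
    near = near_field_shading((0.0, 0.0, 1.0), (0.0, 0.0, -1.0))
    far = near_field_shading((0.0, 0.0, 2.0), (0.0, 0.0, -1.0))
    print(near, far)
```

Under this assumed model, doubling the distance to a head-on patch reduces the predicted intensity by a factor of four, and tilting the patch away from the light reduces it by the cosine of the tilt angle. This is the kind of coupling between image intensity, distance, and orientation that a near-field lighting loss can exploit to constrain camera pose and geometry.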