In robot-assisted minimally invasive surgery, accurate 3D reconstruction from endoscopic video is vital for downstream tasks and improved outcomes. However, endoscopic scenes present unique challenges, including photometric inconsistencies, non-rigid tissue motion, and view-dependent highlights. Most methods based on 3D Gaussian Splatting (3DGS) rely solely on appearance constraints during optimization, which is often insufficient in this setting: these dynamic visual artifacts can mislead the optimization and lead to inaccurate reconstructions. To address these limitations, we present EndoWave, a unified spatio-temporal Gaussian Splatting framework that incorporates an optical flow-based geometric constraint and multi-resolution rational wavelet supervision. First, we adopt a unified spatio-temporal Gaussian representation that directly optimizes primitives in a 4D domain. Second, we propose a geometric constraint derived from optical flow to enhance temporal coherence and constrain the 3D structure of the scene. Third, we propose a multi-resolution rational orthogonal wavelet constraint, which effectively separates fine details in endoscopic images and improves rendering performance. Extensive evaluations on two real surgical datasets, EndoNeRF and StereoMIS, demonstrate that EndoWave achieves state-of-the-art reconstruction quality and visual accuracy compared with the baseline method.
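To make the idea of multi-resolution wavelet supervision concrete, the following is a minimal sketch and not the EndoWave implementation. It assumes a plain orthonormal Haar wavelet as a stand-in for the rational orthogonal wavelet named above, single-channel images, and an unweighted L1 penalty per sub-band; the function names `haar_dwt2` and `wavelet_l1_loss` and the parameters `levels` and `detail_weight` are hypothetical.

```python
import numpy as np

def haar_dwt2(img):
    """One level of an orthonormal 2D Haar transform on an (H, W) array.

    Assumption: H and W are even. Returns the coarse band and three detail bands.
    """
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    coarse = (a + b + c + d) / 2.0
    d_cols = (a - b + c - d) / 2.0  # difference across columns
    d_rows = (a + b - c - d) / 2.0  # difference across rows
    d_diag = (a - b - c + d) / 2.0  # diagonal difference
    return coarse, (d_cols, d_rows, d_diag)

def wavelet_l1_loss(rendered, target, levels=3, detail_weight=1.0):
    """Sum of mean absolute differences between wavelet sub-bands of two images.

    Hypothetical loss: detail bands at every level plus the final coarse band.
    """
    r = np.asarray(rendered, dtype=np.float64)
    t = np.asarray(target, dtype=np.float64)
    if r.shape != t.shape or r.ndim != 2:
        raise ValueError("expected two 2D arrays of equal shape")
    if r.shape[0] % (2 ** levels) or r.shape[1] % (2 ** levels):
        raise ValueError("height and width must be divisible by 2**levels")
    loss = 0.0
    for _ in range(levels):
        r, r_details = haar_dwt2(r)
        t, t_details = haar_dwt2(t)
        for rd, td in zip(r_details, t_details):
            loss += detail_weight * np.mean(np.abs(rd - td))
    # Penalize the remaining low-frequency residual at the coarsest scale.
    loss += np.mean(np.abs(r - t))
    return float(loss)
```

In a training loop such a term would typically be added to the usual photometric loss. Raising `detail_weight` puts more emphasis on high-frequency structure, which is one plausible reading of how wavelet supervision could help with fine tissue detail.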