Reconstructing surgical scenes from stereo or monocular endoscopic video is a crucial and intricate task in robotic minimally invasive surgery, with considerable potential for clinical applications. NeRF-based techniques have recently attracted attention for their ability to reconstruct scenes implicitly. In contrast, 3D Gaussian splatting (3D-GS) represents scenes explicitly with 3D Gaussians and projects them onto a 2D plane, replacing the complex volume rendering used in NeRF. However, both families of methods face challenges in surgical scene reconstruction, such as slow inference, dynamic scenes, and occlusion by surgical tools. This work reviews state-of-the-art (SOTA) approaches, discussing their innovations and implementation principles. Furthermore, we replicate the models and evaluate them on two datasets. The results show that, as these techniques advance, real-time, high-quality reconstruction becomes feasible.