Reconstruction of endoscopic scenes is an important asset for various medical applications, from post-surgery analysis to educational training. Neural rendering has recently shown promising results in endoscopic reconstruction with deforming tissue. However, existing setups have been restricted to a static endoscope or limited deformation, or have required an external tracking device to obtain the pose of the endoscopic camera. With FLex, we address the challenging setup of a moving endoscope within a highly dynamic environment of deforming tissue. We propose an implicit scene separation into multiple overlapping 4D neural radiance fields (NeRFs) and a progressive optimization scheme that jointly optimizes for reconstruction and camera poses from scratch. This improves ease of use and allows reconstruction to scale in time to surgical videos of 5,000 frames and more, an improvement of more than ten times over the state of the art, while being agnostic to external tracking information. Extensive evaluations on the StereoMIS dataset show that FLex significantly improves the quality of novel view synthesis while maintaining competitive pose accuracy.