Reconstructing deformable tissues from endoscopic stereo videos in robotic surgery is crucial for various clinical applications. However, existing methods that rely only on implicit representations are computationally expensive and can take dozens of hours to optimize, which limits their practical use. To address this challenge, we introduce LerPlane, a novel method for fast and accurate reconstruction of surgical scenes under a single-viewpoint setting. LerPlane treats surgical procedures as 4D volumes and factorizes them into explicit 2D planes of static and dynamic fields, leading to a compact memory footprint and significantly accelerated optimization. The factorization is realized by fusing features obtained through linear interpolation on each plane, which allows lightweight neural networks to model surgical scenes. In addition, LerPlane shares static fields, significantly reducing the workload of dynamic tissue modeling. We also propose a novel sampling scheme to boost optimization and improve performance in regions with tool occlusion and large motions. Experiments on DaVinci robotic surgery videos demonstrate that LerPlane accelerates optimization by over 100$\times$ while maintaining high quality across various non-rigid deformations, showing significant promise for future intraoperative surgical applications.
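The plane factorization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the plane names (`xy`, `xz`, `yz` for the static field; `xt`, `yt`, `zt` for the dynamic field), the fusion by concatenation, and the helper names `make_plane`, `sample_plane`, and `query_features` are all assumptions made for illustration. The sketch only shows the general idea of querying a 4D point (x, y, z, t) by linearly interpolating features on 2D planes and fusing them into one vector that a small network could consume.

```python
import math


def make_plane(height, width, channels, fill=0.0):
    """Build a (height x width x channels) feature plane as nested lists (hypothetical helper)."""
    return [[[fill] * channels for _ in range(width)] for _ in range(height)]


def lerp(a, b, w):
    """Linearly interpolate between two feature vectors with weight w in [0, 1]."""
    return [(1.0 - w) * ai + w * bi for ai, bi in zip(a, b)]


def sample_plane(plane, u, v):
    """Bilinearly interpolate a plane at normalized coordinates u (columns), v (rows) in [0, 1]."""
    height, width = len(plane), len(plane[0])
    x, y = u * (width - 1), v * (height - 1)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    top = lerp(plane[y0][x0], plane[y0][x1], fx)
    bottom = lerp(plane[y1][x0], plane[y1][x1], fx)
    return lerp(top, bottom, fy)


def query_features(static_planes, dynamic_planes, x, y, z, t):
    """Fuse per-plane features for one 4D point.

    Assumption: fusion is plain concatenation; the abstract does not say
    which fusion operator is used.
    """
    coords = {"x": x, "y": y, "z": z, "t": t}
    fused = []
    for planes in (static_planes, dynamic_planes):
        for axes in sorted(planes):
            first, second = axes  # e.g. "xt" -> ("x", "t")
            fused.extend(sample_plane(planes[axes], coords[first], coords[second]))
    return fused


# Illustrative sizes only: 8x8 planes with 4 feature channels each.
channels, resolution = 4, 8
static_planes = {k: make_plane(resolution, resolution, channels, 1.0) for k in ("xy", "xz", "yz")}
dynamic_planes = {k: make_plane(resolution, resolution, channels, 2.0) for k in ("xt", "yt", "zt")}
features = query_features(static_planes, dynamic_planes, 0.3, 0.5, 0.7, 0.1)
```

Under these assumptions, storage grows with the square of the grid resolution (six planes) rather than with its fourth power (a dense 4D grid), which is one way to read the abstract's claim of a compact memory footprint. The static planes do not depend on t, so the same three planes serve every time step.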