Laparoscopic liver surgery presents surgeons with a complex, dynamic intraoperative environment in which it remains a significant challenge to distinguish critical or even hidden structures inside the liver. Liver anatomical landmarks, such as the ridge and ligament, serve as important markers for 2D-3D alignment, which can significantly enhance surgeons' spatial perception and support precise surgery. To facilitate the detection of laparoscopic liver landmarks, we collect a novel dataset called L3D, which comprises 1,152 frames with detailed landmark annotations from surgical videos of 39 patients across two medical sites. For benchmarking purposes, 12 mainstream detection methods are selected and comprehensively evaluated on L3D. Further, we propose a depth-driven geometric prompt learning network, namely D2GPLand. Specifically, we design a Depth-aware Prompt Embedding (DPE) module that is guided by self-supervised prompts and generates semantically relevant geometric information using global depth cues extracted from SAM-based features. Additionally, a Semantic-specific Geometric Augmentation (SGA) scheme is introduced to efficiently merge RGB-D spatial and geometric information through reverse anatomic perception. The experimental results indicate that D2GPLand obtains state-of-the-art performance on L3D, with a DICE score of 63.52% and an IoU score of 48.68%. Together with 2D-3D fusion technology, our method can directly provide the surgeon with intuitive guidance information in laparoscopic scenarios.
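For readers unfamiliar with the two reported metrics, the following is a minimal sketch of the standard DICE and IoU definitions for a pair of binary masks. It is not the evaluation code used for L3D: the abstract does not state how scores are aggregated across frames or landmark classes, and the function names, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
from typing import Sequence, Tuple

# Hypothetical mask format: a 2D grid of 0/1 values (rows of pixels).
Mask = Sequence[Sequence[int]]


def overlap_counts(pred: Mask, target: Mask) -> Tuple[int, int, int]:
    """Return (intersection, predicted pixels, target pixels) for two binary masks."""
    inter = pred_sum = target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p, t = int(bool(p)), int(bool(t))
            inter += p & t
            pred_sum += p
            target_sum += t
    return inter, pred_sum, target_sum


def dice_score(pred: Mask, target: Mask) -> float:
    """DICE = 2 * |P ∩ T| / (|P| + |T|)."""
    inter, pred_sum, target_sum = overlap_counts(pred, target)
    if pred_sum + target_sum == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * inter / (pred_sum + target_sum)


def iou_score(pred: Mask, target: Mask) -> float:
    """IoU = |P ∩ T| / |P ∪ T|."""
    inter, pred_sum, target_sum = overlap_counts(pred, target)
    union = pred_sum + target_sum - inter
    if union == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return inter / union


# Toy example: 3 predicted pixels, 3 target pixels, 2 of them shared.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(f"DICE = {dice_score(pred, target):.4f}, IoU = {iou_score(pred, target):.4f}")
```

For a single pair of masks the two metrics are tied by DICE = 2·IoU / (1 + IoU), but this identity generally does not hold once scores are averaged over many frames or classes, so dataset-level DICE and IoU figures need not satisfy it.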