In minimally invasive surgery, localizing surgical instruments in endoscopic videos is a crucial task that enables various applications for improving surgical outcomes. However, annotating instrument locations in endoscopic videos is tedious and labor-intensive. In contrast, instrument category information is easy and efficient to obtain in real-world applications. To fully utilize the category information and address the localization problem, we propose WS-YOLO, a weakly supervised localization framework for surgical instruments. Using instrument category information as weak supervision, WS-YOLO adopts an unsupervised multi-round training strategy to learn localization. We validate WS-YOLO on the Endoscopic Vision Challenge 2023 dataset, where it achieves remarkable performance in weakly supervised surgical instrument localization. The source code is available at https://github.com/Breezewrf/WS-YOLO.