Surgical tool detection is essential for analyzing and evaluating minimally invasive surgery videos. Current approaches are mostly supervised and require large amounts of instance-level labels (i.e., bounding boxes). However, large image datasets with instance-level labels are scarce because of the annotation burden. Detecting surgical tools from image-level labels instead is therefore important, since image-level annotations are considerably more time-efficient to produce than instance-level annotations. In this work, we propose to strike a balance between the extremely costly annotation burden and detection performance. We further propose a co-occurrence loss, which leverages image-level labels by exploiting the fact that some tool pairs often co-occur in an image. Encoding this co-occurrence knowledge in the loss helps to overcome the classification difficulty that arises because some tools have similar shapes and textures. Extensive experiments on the Endovis2018 dataset in various data settings show the effectiveness of our method.
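To make the co-occurrence idea concrete, the following is a minimal sketch of how pairwise co-occurrence statistics could be estimated from image-level labels and turned into a penalty term. It is not the loss defined in this work, whose exact form is not given in the abstract. The function names `cooccurrence_stats` and `cooccurrence_penalty`, the toy label matrix, and the specific penalty formula are all assumptions made for illustration.

```python
import numpy as np

def cooccurrence_stats(labels):
    """Estimate co-occurrence statistics from image-level labels.

    labels: (num_images, num_classes) multi-hot array, 1 if the tool is in the image.
    Returns (counts, cond), where counts[i, j] is the number of images containing
    both tool i and tool j, and cond[i, j] estimates P(tool j | tool i).
    """
    labels = np.asarray(labels, dtype=np.float64)
    counts = labels.T @ labels            # pairwise joint counts
    per_class = np.diag(counts).copy()    # images containing each tool
    denom = per_class[:, None]
    # Rows of classes that never appear stay at zero instead of dividing by zero.
    cond = np.divide(counts, denom, out=np.zeros_like(counts), where=denom > 0)
    np.fill_diagonal(cond, 0.0)
    return counts, cond

def cooccurrence_penalty(probs, cond):
    """Hypothetical penalty (an assumption, not the paper's loss): it grows when
    two tools that rarely co-occur are both predicted with high probability."""
    probs = np.asarray(probs, dtype=np.float64)
    rarity = 1.0 - np.maximum(cond, cond.T)   # 0 for pairs that always co-occur
    np.fill_diagonal(rarity, 0.0)             # ignore a tool paired with itself
    joint = np.outer(probs, probs)            # predicted joint presence per pair
    return 0.5 * float(np.sum(joint * rarity))  # 0.5: each pair is counted twice

# Toy image-level labels for three hypothetical tool classes.
toy_labels = [[1, 1, 0],
              [1, 1, 0],
              [1, 0, 1],
              [0, 0, 1]]
counts, cond = cooccurrence_stats(toy_labels)
frequent_pair = cooccurrence_penalty([0.9, 0.9, 0.1], cond)  # tools 0 and 1 co-occur often
rare_pair = cooccurrence_penalty([0.1, 0.9, 0.9], cond)      # tools 1 and 2 never co-occur
```

In this toy setting the prediction that pairs tools 1 and 2, which never appear together in the labels, receives the larger penalty. This is the kind of signal that could help separate tools with similar shapes and textures.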