PWISeg: Point-based Weakly-supervised Instance Segmentation for Surgical Instruments

In surgical procedures, correct instrument counting is essential. Instance segmentation localizes each object not only by its bounding box but also at the pixel level. However, obtaining mask-level annotations for instance segmentation is labor-intensive. To address this issue, we propose a novel yet effective weakly-supervised surgical instrument instance segmentation approach, named Point-based Weakly-supervised Instance Segmentation (PWISeg). PWISeg adopts an FCN-based architecture with point-to-box and point-to-mask branches that model, on the FPN, the relationships between feature points and bounding boxes and between feature points and segmentation masks, accomplishing instrument detection and segmentation jointly in a single model. Since mask-level annotations are hard to obtain in the real world, we introduce an unsupervised projection loss for point-to-mask training, which uses the projection relation between predicted masks and bounding boxes as the supervision signal. In addition, we annotate a few pixels of each instrument as key pixels. Based on these, we further propose a key pixel association loss and a key pixel distribution loss, which drive the point-to-mask branch to generate more accurate segmentation predictions. To comprehensively evaluate this task, we release a new surgical instrument dataset with manual annotations, setting up a benchmark for further research. Extensive experiments validate the performance of PWISeg. The results show improved surgical instrument segmentation accuracy, surpassing most weakly-supervised instance segmentation methods that rely on bounding-box supervision. This improvement is observed consistently on our proposed dataset and on the public HOSPI-Tools dataset.
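To make the projection idea concrete, the following is a minimal sketch of one common way to compare a mask with a box through their axis projections. It is not the PWISeg implementation: the abstract does not give the formula, so the max-projection, the Dice comparison, and the helper names `project`, `dice_loss`, `box_to_mask`, and `projection_loss` are all assumptions made for illustration. The box here may stand for either a predicted or an annotated box.

```python
# Illustrative sketch only; the formulation below is an assumption,
# not the loss defined by the PWISeg authors.

def project(mask, axis):
    """Max-project a 2D mask (list of rows) onto one axis.

    axis=0 gives one value per column (x-profile);
    axis=1 gives one value per row (y-profile).
    """
    if axis == 1:
        return [max(row) for row in mask]
    return [max(col) for col in zip(*mask)]


def dice_loss(pred, target, eps=1e-6):
    """1 - Dice coefficient between two 1D profiles."""
    inter = sum(p * t for p, t in zip(pred, target))
    denom = sum(p * p for p in pred) + sum(t * t for t in target)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)


def box_to_mask(box, height, width):
    """Rasterize a box (x0, y0, x1, y1), inclusive pixel coordinates."""
    x0, y0, x1, y1 = box
    return [
        [1.0 if x0 <= x <= x1 and y0 <= y <= y1 else 0.0 for x in range(width)]
        for y in range(height)
    ]


def projection_loss(pred_mask, box):
    """Sum of Dice losses between the x- and y-projections of mask and box."""
    height, width = len(pred_mask), len(pred_mask[0])
    box_mask = box_to_mask(box, height, width)
    loss_x = dice_loss(project(pred_mask, 0), project(box_mask, 0))
    loss_y = dice_loss(project(pred_mask, 1), project(box_mask, 1))
    return loss_x + loss_y


if __name__ == "__main__":
    box = (1, 1, 3, 2)                      # hypothetical box in a 4x5 image
    filled = box_to_mask(box, 4, 5)         # mask that fills the box exactly
    empty = [[0.0] * 5 for _ in range(4)]   # mask that predicts nothing
    print(projection_loss(filled, box))
    print(projection_loss(empty, box))
```

Under this assumed formulation the loss only constrains the extent of the mask along each axis, so any mask that touches all four box edges scores as well as one that fills the box. That gap is one plausible reason additional cues, such as the key pixel losses described above, are useful.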
