Surgical instrument segmentation (SIS) is pivotal for robotic-assisted minimally invasive surgery, assisting surgeons by identifying surgical instruments in endoscopic video frames. Recent unsupervised surgical instrument segmentation (USIS) methods rely primarily on pseudo-labels derived from low-level features such as color and optical flow, but they show limited effectiveness and generalizability in complex and unseen endoscopic scenarios. In this work, we propose a label-free unsupervised model featuring a novel module named Multi-View Normalized Cutter (m-NCutter). Unlike previous USIS works, our model is trained with a graph-cutting loss function that leverages patch affinities for supervision, eliminating the need for pseudo-labels. The framework adaptively determines which affinities, from which feature levels, should be prioritized; low- and high-level features and their affinities are thereby integrated effectively to train a label-free unsupervised model with superior effectiveness and generalization ability. We conduct comprehensive experiments across multiple SIS datasets to validate our approach's state-of-the-art (SOTA) performance, robustness, and exceptional potential as a pre-trained model. Our code is released at https://github.com/MingyuShengSMY/AMNCutter.
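To make the graph-cutting supervision concrete, the sketch below shows a generic soft normalized-cut objective over a patch-affinity matrix. This is an illustrative baseline under stated assumptions, not the paper's m-NCutter: `soft_ncut_loss`, its arguments `W` (affinities) and `S` (soft cluster assignments), and the NumPy formulation are all assumptions introduced here for exposition.

```python
import numpy as np

def soft_ncut_loss(W, S, eps=1e-8):
    """Soft normalized-cut loss over K clusters (illustrative sketch).

    W: (N, N) symmetric, non-negative patch-affinity matrix.
    S: (N, K) soft cluster assignments (each row sums to 1).
    Returns K - sum_k assoc(A_k, A_k) / assoc(A_k, V), which is
    minimized when clusters are internally dense and mutually sparse,
    so no pixel-level pseudo-labels are needed for supervision.
    """
    d = W.sum(axis=1)                          # node degrees, shape (N,)
    assoc = np.einsum('nk,nm,mk->k', S, W, S)  # within-cluster affinity mass
    degree = S.T @ d + eps                     # total affinity touching each cluster
    K = S.shape[1]
    return K - np.sum(assoc / degree)
```

A differentiable loss of this form can be applied to affinities computed at several feature levels; a learned weighting over those per-level losses would correspond to the adaptive prioritization described above.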