Weakly supervised instance segmentation (WSIS) using only image-level labels is challenging because coarse image-level annotations are hard to align with the fine-grained segmentation task. Nevertheless, with the advancement of deep neural networks (DNNs), WSIS has garnered significant attention. Following the proposal-based paradigm, we encounter a redundant segmentation problem: a single instance is represented by multiple proposals. For example, when an image of a dog and its proposals are fed into the network, we expect a single output proposal containing the dog, but the network outputs several. To address this problem, we propose a novel WSIS approach that refines complete instances online. It uses MaskIoU heads to predict the integrity scores of proposals, and a Complete Instances Mining (CIM) strategy to explicitly model the redundant segmentation problem and generate refined pseudo labels. Our approach makes the network aware of both multiple instances and complete instances, and an Anti-noise strategy further improves its robustness. Empirical evaluations on the PASCAL VOC 2012 and MS COCO datasets demonstrate that our method achieves state-of-the-art performance by a notable margin. Our implementation will be made available at https://github.com/ZechengLi19/CIM.
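The redundant segmentation problem described above can be illustrated with a small, self-contained sketch. It is not the CIM strategy or the released implementation; it is a generic grouping of overlapping proposals by mask IoU, written under several assumptions: masks are represented as sets of pixel coordinates, the `scores` list stands in for the integrity scores that a MaskIoU head might predict, and the names `mask_iou`, `rect`, `group_redundant` and the threshold `iou_thresh=0.5` are hypothetical choices made for illustration.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]


def mask_iou(a: Set[Pixel], b: Set[Pixel]) -> float:
    """Intersection over union of two binary masks given as pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rect(top: int, left: int, bottom: int, right: int) -> Set[Pixel]:
    """Toy rectangular mask covering rows [top, bottom) and cols [left, right)."""
    return {(r, c) for r in range(top, bottom) for c in range(left, right)}


def group_redundant(
    masks: List[Set[Pixel]],
    scores: List[float],
    iou_thresh: float = 0.5,  # hypothetical threshold, not taken from the paper
) -> List[List[int]]:
    """Greedily group proposals that overlap a higher-scoring proposal.

    Each group starts with its highest-scoring member; the remaining members
    are proposals that redundantly cover the same region.
    """
    order = sorted(range(len(masks)), key=lambda i: scores[i], reverse=True)
    assigned = set()
    groups: List[List[int]] = []
    for i in order:
        if i in assigned:
            continue
        group = [i]
        assigned.add(i)
        for j in order:
            if j not in assigned and mask_iou(masks[i], masks[j]) >= iou_thresh:
                group.append(j)
                assigned.add(j)
        groups.append(group)
    return groups


# Toy data: proposals 0-2 all cover the same "dog"; proposal 3 is a second object.
masks = [
    rect(0, 0, 10, 10),    # complete instance
    rect(0, 0, 10, 8),     # partial view of the same instance
    rect(0, 0, 8, 10),     # another partial view of the same instance
    rect(20, 20, 30, 30),  # a different instance
]
scores = [0.9, 0.7, 0.6, 0.8]  # assumed integrity scores, for illustration only

groups = group_redundant(masks, scores)
print(groups)  # [[0, 1, 2], [3]]
```

In this toy setting, three proposals collapse into one group led by the most complete mask, which is the behavior one would like a network to learn. The approach in the abstract handles this online during training through refined pseudo labels, whereas this sketch only groups proposals after the fact.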