Semi-Supervised Instance Segmentation (SSIS) aims to leverage a large amount of unlabeled data during training. Previous frameworks primarily utilized the RGB information of unlabeled images to generate pseudo-labels. However, such a mechanism often introduces unstable noise, since a single instance can display multiple RGB values. To overcome this limitation, we introduce a Depth-Guided (DG) SSIS framework. This framework uses depth maps extracted from the input images, which represent individual instances with closely associated distance values and thus offer precise contours for distinct instances. Unlike RGB data, depth maps provide a unique perspective, which makes their integration into the SSIS pipeline non-trivial. To this end, we propose Depth Feature Fusion, which integrates features extracted from depth estimation. This integration allows the model to understand depth information better and ensures its effective utilization. Additionally, to manage the variability of depth images during training, we introduce the Depth Controller. This component enables adaptive adjustment of the depth maps, enhancing convergence speed and dynamically balancing the loss weights between the RGB and depth branches. Extensive experiments conducted on the COCO and Cityscapes datasets validate the efficacy of our proposed method. Our approach establishes a new benchmark for SSIS, outperforming previous methods. Specifically, our DG framework achieves 22.29%, 31.47%, and 35.14% mAP with 1%, 5%, and 10% labeled data on the COCO dataset, respectively.
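The two components named above can be illustrated with a minimal sketch. The paper's actual layer designs are not specified in the abstract, so everything below is an assumption for illustration: the fusion is shown as a simple weighted blend of RGB and depth feature maps, and the loss balancing weights each branch's loss inversely to its current magnitude so that neither branch dominates.

```python
import numpy as np

def fuse_features(rgb_feat, depth_feat, alpha=0.5):
    """Hypothetical Depth Feature Fusion: blend RGB and depth feature
    maps with a mixing weight alpha (assumed form, not the paper's)."""
    return alpha * rgb_feat + (1.0 - alpha) * depth_feat

def balance_losses(loss_rgb, loss_depth, eps=1e-8):
    """Hypothetical Depth Controller behavior: weight each loss term
    inversely to its magnitude, so the larger loss is down-weighted
    and the two branches stay balanced during training."""
    denom = loss_rgb + loss_depth + eps
    w_rgb = loss_depth / denom
    w_depth = loss_rgb / denom
    return w_rgb * loss_rgb + w_depth * loss_depth

# Toy usage: 4x4 feature maps and scalar branch losses.
rgb = np.ones((4, 4))
depth = np.zeros((4, 4))
fused = fuse_features(rgb, depth, alpha=0.5)   # every entry is 0.5
total = balance_losses(3.0, 1.0)               # 0.25*3 + 0.75*1 = 1.5
```

In this toy balancing rule the RGB loss (3.0) receives weight 0.25 and the depth loss (1.0) receives weight 0.75, so the total stays moderate even when one branch's loss spikes; the paper's controller additionally adapts the depth maps themselves, which is not modeled here.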