In this paper, we introduce SemiRES, a semi-supervised framework that leverages both labeled and unlabeled data for referring expression segmentation (RES). A significant hurdle in applying semi-supervised techniques to RES is the prevalence of noisy pseudo-labels, particularly at object boundaries. SemiRES incorporates the Segment Anything Model (SAM), renowned for its precise boundary delineation, to improve the accuracy of these pseudo-labels. Within SemiRES, we offer two alternative matching strategies: IoU-based Optimal Matching (IOM) and Composite Parts Integration (CPI). These strategies are designed to extract the most accurate masks from SAM's output, which then guide the training of the student model with greater precision. When no precise mask can be matched from the available candidates, we fall back on the Pixel-Wise Adjustment (PWA) strategy, which guides the student model's training directly with the pseudo-labels. Extensive experiments on three RES benchmarks (RefCOCO, RefCOCO+, and G-Ref) demonstrate its superior performance compared to fully supervised methods. Remarkably, with only 1% labeled data, SemiRES outperforms the supervised baseline by a large margin, e.g., a gain of +18.64% on the RefCOCO val set. The project code is available at \url{https://github.com/nini0919/SemiRES}.
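As an illustration of the IoU-based matching idea, the following is a minimal sketch and not the released SemiRES implementation. It assumes that matching means selecting the SAM candidate mask with the highest IoU against the teacher's pseudo-label, and that the pseudo-label itself is used when no candidate is close enough. The function names, the nested-list mask representation, and the `threshold` value are hypothetical choices made for this example.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]  # binary mask as rows of 0/1 values, 1 = foreground


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two same-sized binary masks."""
    inter = 0
    union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            if pa and pb:
                inter += 1
            if pa or pb:
                union += 1
    return inter / union if union else 0.0


def match_by_iou(
    pseudo: Mask,
    candidates: List[Mask],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> Tuple[Mask, Optional[int]]:
    """Pick the candidate with the highest IoU against the pseudo-label.

    Returns (mask, index). If no candidate reaches `threshold`, the
    pseudo-label is returned unchanged with index None (assumed fallback).
    """
    best_idx: Optional[int] = None
    best_iou = 0.0
    for i, cand in enumerate(candidates):
        score = iou(pseudo, cand)
        if score > best_iou:
            best_idx, best_iou = i, score
    if best_idx is not None and best_iou >= threshold:
        return candidates[best_idx], best_idx
    return pseudo, None


# Toy usage: two candidate masks for a 2x3 pseudo-label.
pseudo_label = [[1, 1, 0],
                [1, 0, 0]]
sam_candidates = [
    [[0, 0, 1], [0, 0, 1]],  # disjoint from the pseudo-label
    [[1, 1, 0], [1, 1, 0]],  # overlaps on 3 of 4 foreground pixels
]
matched_mask, matched_index = match_by_iou(pseudo_label, sam_candidates)
```

In this sketch the matched mask replaces the noisy pseudo-label as the training target; in practice the candidates would come from SAM's mask proposals for the same image, and the cutoff would need to be tuned.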