Training and validating semantic segmentation models requires datasets with pixel-wise annotations, which are notoriously labor-intensive to produce. Useful priors such as foundation models and crowdsourced datasets are available, but they are error-prone. We therefore propose an effective framework for active label correction (ALC) built on a correction query that rectifies the pseudo labels of pixels; according to our theoretical analysis and user study, this query is more annotator-friendly than the standard query, which asks the annotator to classify a pixel directly. Specifically, leveraging foundation models that provide useful zero-shot predictions for pseudo labels and superpixels, our method comprises two key techniques: (i) an annotator-friendly correction query design based on the pseudo labels, and (ii) an acquisition function that looks ahead to label expansions based on the superpixels. Experimental results on the PASCAL, Cityscapes, and Kvasir-SEG datasets demonstrate the effectiveness of our ALC framework, which outperforms prior methods for active semantic segmentation and label correction. Notably, using our method, we obtained a revised version of the PASCAL dataset by rectifying errors in 2.6 million pixels.
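To make the two techniques concrete, the following is a minimal sketch of one correction round on a toy image. It is an illustration under stated assumptions, not the framework's actual implementation: the function names `select_query` and `apply_correction` are hypothetical, the score (summed doubt in the pseudo label over a superpixel) is a simplified stand-in for the look-ahead acquisition function, and the annotator is simulated by reading a ground-truth array.

```python
import numpy as np


def select_query(pseudo, probs, superpixels):
    """Pick the next correction query (hypothetical helper, simplified scoring).

    pseudo:      (H, W) int array of pseudo labels
    probs:       (H, W, C) array of per-pixel class probabilities
    superpixels: (H, W) int array of superpixel ids
    Returns (superpixel_id, (row, col)).
    """
    # Confidence the model assigns to the current pseudo label of each pixel.
    conf = np.take_along_axis(probs, pseudo[..., None], axis=-1)[..., 0]
    doubt = 1.0 - conf

    # Assumed look-ahead score: total doubt that a single correction
    # could resolve if it were expanded over the whole superpixel.
    ids = np.unique(superpixels)
    scores = np.array([doubt[superpixels == s].sum() for s in ids])
    best = ids[np.argmax(scores)]

    # Query the most doubtful pixel inside the chosen superpixel.
    masked = np.where(superpixels == best, doubt, -np.inf)
    row, col = np.unravel_index(np.argmax(masked), masked.shape)
    return int(best), (int(row), int(col))


def apply_correction(pseudo, superpixels, sp_id, pixel, true_label):
    """Expand the annotator's answer within the superpixel (assumed rule):
    every pixel that shared the queried pixel's pseudo label is relabeled."""
    old = pseudo[pixel]
    out = pseudo.copy()
    out[(superpixels == sp_id) & (pseudo == old)] = true_label
    return out


# Toy data: a 2x4 image with two superpixels and two classes.
gt = np.array([[0, 0, 1, 1],
               [0, 0, 1, 1]])
superpixels = gt.copy()           # superpixels happen to match the objects here
pseudo = np.zeros_like(gt)        # the pseudo labels wrongly say "class 0" everywhere
probs = np.empty((2, 4, 2))
probs[:, :2] = [0.9, 0.1]         # left half: confident in class 0
probs[:, 2:] = [0.6, 0.4]         # right half: less confident in class 0

sp_id, pixel = select_query(pseudo, probs, superpixels)
# Simulated annotator: confirms or corrects the pseudo label of the queried pixel.
corrected = apply_correction(pseudo, superpixels, sp_id, pixel, gt[pixel])
```

In this toy case a single query on the less confident superpixel fixes all four of its mislabeled pixels, which is the kind of payoff a look-ahead acquisition function is meant to anticipate.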