Scribbles for All: Benchmarking Scribble Supervised Segmentation Across Datasets

In this work, we introduce Scribbles for All, a label and training data generation algorithm for semantic segmentation trained on scribble labels. Training or fine-tuning semantic segmentation models with weak supervision has recently become an important topic, and model quality in this setting has advanced significantly. Scribbles are a promising label type here: they achieve high-quality segmentation results while requiring much less annotation effort than the usual dense, pixel-wise semantic segmentation annotations. The main limitation of scribbles as a source of weak supervision is the lack of challenging datasets for scribble segmentation, which hinders the development of novel methods and conclusive evaluations. To overcome this limitation, Scribbles for All provides scribble labels for several popular segmentation datasets, together with an algorithm that automatically generates scribble labels for any dataset with dense annotations, paving the way for new insights and model advancements in the field of weakly supervised segmentation. In addition to providing the datasets and the algorithm, we evaluate state-of-the-art segmentation models on our datasets and show that models trained with our synthetic labels perform competitively with models trained on manual labels. Thus, our datasets enable state-of-the-art research into methods for scribble-labeled semantic segmentation. The datasets, scribble generation algorithm, and baselines are publicly available at https://github.com/wbkit/Scribbles4All
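To make the idea of deriving scribble labels from dense annotations concrete, the following is a minimal toy sketch. It is not the Scribbles for All algorithm, whose details the abstract does not give. It assumes a dense label map stored as a list of rows of integer class ids, and it keeps a single horizontal stroke per class while marking every other pixel as unlabeled. The names `toy_scribbles`, `longest_run`, `IGNORE`, and `margin` are hypothetical and introduced only for illustration.

```python
IGNORE = 255  # assumed id for "unlabeled" pixels; not taken from the paper


def longest_run(row, cls):
    """Return (start, end) of the longest run of `cls` in `row`, end exclusive."""
    best = (0, 0)
    start = None
    # A trailing None sentinel closes a run that reaches the end of the row.
    for x, value in enumerate(list(row) + [None]):
        if value == cls and start is None:
            start = x
        elif value != cls and start is not None:
            if x - start > best[1] - best[0]:
                best = (start, x)
            start = None
    return best


def toy_scribbles(dense, ignore=IGNORE, margin=1):
    """Keep one horizontal stroke per class; mark every other pixel as ignored."""
    scribble = [[ignore] * len(row) for row in dense]
    classes = sorted({v for row in dense for v in row if v != ignore})
    for cls in classes:
        # Choose the row that contains the most pixels of this class.
        y = max(range(len(dense)), key=lambda r: dense[r].count(cls))
        start, end = longest_run(dense[y], cls)
        # Pull the stroke away from the region boundary when it is long enough.
        if end - start > 2 * margin:
            start, end = start + margin, end - margin
        for x in range(start, end):
            scribble[y][x] = cls
    return scribble


# Toy 6x8 label map: background (0) with a rectangular region of class 1.
dense = [[1 if 2 <= y <= 4 and x >= 3 else 0 for x in range(8)] for y in range(6)]
scribble = toy_scribbles(dense)
```

In this sketch every labeled scribble pixel agrees with the dense annotation by construction, and only a small fraction of pixels carries a label. That is the property that makes scribbles cheap to annotate and, with a generator of this kind, possible to derive automatically from existing dense datasets.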

