Pre-training is an effective strategy for training visual models efficiently with a limited number of labeled images. In semantic segmentation, creating annotation masks requires intensive labor and time, so a large-scale pre-training dataset with semantic labels is difficult to construct. Moreover, what matters in semantic segmentation pre-training has not been fully investigated. In this paper, we propose the Segmentation Radial Contour DataBase (SegRCDB), which applies formula-driven supervised learning to semantic segmentation for the first time. SegRCDB enables pre-training for semantic segmentation without real images or any manual semantic labels. Its design is based on insights into what is important in pre-training for semantic segmentation, which allows efficient pre-training. With the same number of training images, pre-training with SegRCDB achieved higher mIoU than pre-training with COCO-Stuff when fine-tuned on ADE-20k and Cityscapes. Because it enables the creation of large datasets without manual annotation, SegRCDB has high potential to contribute to semantic segmentation pre-training and its investigation. The SegRCDB dataset will be released under a license that allows research and commercial use. Code is available at: https://github.com/dahlian00/SegRCDB