An increasing number of semantic segmentation datasets covering similar domains have been published over the past few years. Despite the growing amount of overall data, it remains difficult to train bigger and better models due to inconsistencies in the taxonomies and labeling policies of different datasets. To this end, we propose a knowledge distillation approach that also serves as a label space unification method for semantic segmentation. In short, a teacher model is trained on a source dataset with a given taxonomy, then used to pseudo-label additional data for which ground-truth labels in a related label space exist. By mapping the related taxonomies onto the source taxonomy, we create constraints within which the model predicts its pseudo-labels. Using the improved pseudo-labels, we train student models that consistently outperform their teachers in two challenging domains, namely urban and off-road driving. Our ground-truth-corrected pseudo-labels span 12 and 7 public datasets with 388,230 and 18,558 images for the urban and off-road domains, respectively, creating the largest compound datasets for autonomous driving to date.
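The constraint mechanism described above can be sketched as follows. This is a minimal illustration under assumed interfaces, not the authors' implementation: we assume a hypothetical `LABEL_MAP` from each coarse class in a related dataset's taxonomy to the set of admissible source-taxonomy classes, and restrict the teacher's per-pixel probabilities to that set before taking the argmax.

```python
import numpy as np

# Hypothetical mapping for illustration: each class id in a related dataset's
# taxonomy maps to one or more class indices in the source taxonomy.
LABEL_MAP = {
    0: [0],        # e.g. "road"       -> {road}
    1: [1, 2],     # e.g. "vegetation" -> {tree, bush}
    2: [3, 4, 5],  # e.g. "vehicle"    -> {car, truck, bus}
}

def constrained_pseudo_label(teacher_probs, gt_labels):
    """Restrict teacher predictions to classes consistent with the ground truth.

    teacher_probs: (H, W, C) per-pixel class probabilities from the teacher.
    gt_labels:     (H, W) coarse ground-truth ids in the related taxonomy.
    Returns an (H, W) pseudo-label map in the source taxonomy.
    """
    h, w, c = teacher_probs.shape
    pseudo = np.zeros((h, w), dtype=np.int64)
    for coarse_id, allowed in LABEL_MAP.items():
        mask = gt_labels == coarse_id
        if not mask.any():
            continue
        # Zero out probabilities of classes outside the allowed set, then
        # take the argmax over the remaining candidates only.
        restricted = np.zeros((int(mask.sum()), c))
        restricted[:, allowed] = teacher_probs[mask][:, allowed]
        pseudo[mask] = restricted.argmax(axis=1)
    return pseudo
```

A pixel labeled "vehicle" in the related taxonomy can thus only receive a pseudo-label from {car, truck, bus}, so the teacher's fine-grained prediction never contradicts the coarse ground truth.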