In fine-grained road scene understanding, semantic segmentation plays a crucial role in enabling vehicles to perceive and comprehend their surroundings. By assigning a class label to each pixel in an image, it enables precise identification and localization of detailed road features, which is vital for high-quality scene understanding and downstream perception tasks. A key challenge in this domain is improving the recognition of minority classes while mitigating the dominance of majority classes, which is essential for balanced and robust overall performance. However, traditional semi-supervised learning methods often overlook this class imbalance during training. To address this issue, we first propose a general training module that learns from all pseudo-labels without a conventional filtering strategy. Second, we propose a professional training module that learns specifically from reliable minority-class pseudo-labels, identified by a novel mismatch score metric. The two modules supervise each other, which reduces model coupling, an essential property for semi-supervised learning. During contrastive learning, to prevent the majority classes from dominating the feature space, we propose a strategy that assigns evenly distributed anchors to the different classes. Experimental results on multiple public benchmarks show that our method surpasses traditional approaches in recognizing tail classes.
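The abstract does not specify how the evenly distributed anchors are constructed. As a minimal sketch of one possible reading, the code below places one fixed anchor per class at the vertices of a regular simplex on the unit sphere, so that every pair of anchors has the same cosine similarity, -1/(K-1), regardless of class frequency. The simplex construction, the function names `simplex_anchors` and `anchor_loss`, and the `temperature` parameter are all assumptions made for illustration and are not taken from the paper.

```python
import math


def simplex_anchors(num_classes):
    """Return num_classes unit vectors with equal pairwise cosine similarity.

    Hypothetical construction (not from the paper): center the one-hot
    vectors and rescale them, which yields the vertices of a regular simplex.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    scale = math.sqrt(num_classes / (num_classes - 1))
    return [
        [scale * ((1.0 if i == k else 0.0) - 1.0 / num_classes)
         for i in range(num_classes)]
        for k in range(num_classes)
    ]


def cosine(u, v, eps=1e-12):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / max(norm_u * norm_v, eps)


def anchor_loss(feature, label, anchors, temperature=0.1):
    """Cross-entropy over cosine similarities to the fixed class anchors.

    Pulls `feature` toward the anchor of its class and away from the others.
    The temperature value is an assumed default, not a reported setting.
    """
    logits = [cosine(feature, a) / temperature for a in anchors]
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[label]


anchors = simplex_anchors(4)
loss_matched = anchor_loss(anchors[2], 2, anchors)     # feature on its own anchor
loss_mismatched = anchor_loss(anchors[2], 0, anchors)  # feature on another anchor
```

Because the anchors are fixed in advance rather than learned from the data, a class with few pixels gets the same angular margin as a frequent one, which is the intuition behind keeping majority classes from dominating the feature space.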