Despite substantial progress in semantic segmentation, challenges persist because local/global contexts and the relationships between them are insufficiently captured. In this paper, we propose Contextrast, a contrastive learning-based semantic segmentation method that captures local/global contexts and models their relationships. Our method comprises two parts: a) contextual contrastive learning (CCL) and b) boundary-aware negative (BANE) sampling. CCL obtains local/global context from multi-scale feature aggregation and from the inter- and intra-relationships of features, which improves discrimination capability. Meanwhile, BANE sampling selects embedding features along the boundaries of incorrectly predicted regions and uses them as harder negative samples in contrastive learning, addressing segmentation errors along boundary regions by exploiting fine-grained details. We demonstrate that Contextrast substantially improves the performance of semantic segmentation networks, outperforming state-of-the-art contrastive learning approaches on diverse public datasets, e.g., Cityscapes, CamVid, PASCAL-C, COCO-Stuff, and ADE20K, without increasing computational cost during inference.
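To make the idea of boundary-aware negative sampling concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that "boundary of an incorrectly predicted region" can be approximated as a mispredicted pixel with at least one correctly predicted 4-neighbor, and that negatives for an anchor class are drawn from boundary pixels wrongly predicted as that class. The function names `error_boundary_mask` and `sample_boundary_negatives`, the 4-neighborhood, and the uniform random draw are all illustrative assumptions.

```python
import numpy as np


def error_boundary_mask(pred, gt):
    """Mark mispredicted pixels that touch a correctly predicted 4-neighbor.

    pred, gt: integer label maps of shape (H, W).
    Assumption: this 4-neighbor rule stands in for the paper's boundary definition.
    """
    error = pred != gt
    # Replicate border values so the image edge alone never counts as a boundary.
    padded = np.pad(error, 1, mode="edge")
    up = padded[:-2, 1:-1]
    down = padded[2:, 1:-1]
    left = padded[1:-1, :-2]
    right = padded[1:-1, 2:]
    # Interior error pixels have all four neighbors mispredicted as well.
    interior = error & up & down & left & right
    return error & ~interior


def sample_boundary_negatives(embeddings, pred, gt, anchor_class, k, rng):
    """Draw up to k embeddings from boundary pixels wrongly predicted as anchor_class.

    embeddings: array of shape (H, W, D) with one feature vector per pixel.
    Assumption: negatives are sampled uniformly without replacement.
    """
    boundary = error_boundary_mask(pred, gt)
    # Boundary pixels are mispredicted, so pred == anchor_class implies gt != anchor_class.
    candidates = embeddings[boundary & (pred == anchor_class)]
    n = candidates.shape[0]
    if n == 0:
        return candidates  # empty array of shape (0, D)
    idx = rng.choice(n, size=min(k, n), replace=False)
    return candidates[idx]


# Toy usage: a 3x3 block of class-1 predictions on an all-zero ground truth.
gt = np.zeros((5, 5), dtype=int)
pred = gt.copy()
pred[1:4, 1:4] = 1
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(5, 5, 4))
negatives = sample_boundary_negatives(embeddings, pred, gt, anchor_class=1, k=4, rng=rng)
```

In this toy case the center pixel of the mispredicted block is excluded because all of its neighbors are also wrong, so only the ring of pixels adjacent to correct predictions is eligible. The sampled embeddings would then enter a contrastive loss as negatives; that loss is not sketched here because the abstract does not specify its form.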