Contrastive learning has driven substantial progress in Semi-Supervised Semantic Segmentation (S4). However, because annotations are limited, the guidance on unlabeled images is generated by the model itself; this guidance inevitably contains noise and disturbs the unsupervised training process. To address this issue, we propose a robust contrastive-based S4 framework, termed Probabilistic Representation Contrastive Learning (PRCL), which improves the robustness of the unsupervised training process. We model pixel-wise representations as Probabilistic Representations (PRs) via multivariate Gaussian distributions and tune the contribution of ambiguous representations to tolerate the risk of inaccurate guidance in contrastive learning. Furthermore, we introduce Global Distribution Prototypes (GDP) by gathering all PRs throughout the whole training process. Since the GDP contains the information of all representations of the same class, it is robust to instantaneous noise in the representations and accommodates their intra-class variance. In addition, we generate Virtual Negatives (VNs) based on the GDP to participate in the contrastive learning process. Extensive experiments on two public benchmarks demonstrate the superiority of our PRCL framework.
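To make the three ingredients named in the abstract concrete (Gaussian-valued representations, a class prototype aggregated from them, and negatives generated from that prototype), the following is a minimal sketch and not the framework's actual implementation. It assumes diagonal covariances, a mutual-likelihood-style similarity, precision-weighted fusion for the prototype, and plain sampling for the virtual negatives; none of these choices are stated in the abstract. The function names `mutual_likelihood_score`, `fuse_gaussians`, and `sample_virtual_negatives` are hypothetical.

```python
import math
import random


def mutual_likelihood_score(mu1, var1, mu2, var2):
    """Similarity between two diagonal Gaussians, up to an additive constant.

    Assumption: a mutual-likelihood-style score. A larger summed variance
    lowers the score, so ambiguous representations contribute less.
    """
    total = 0.0
    for m1, v1, m2, v2 in zip(mu1, var1, mu2, var2):
        v = v1 + v2
        total += (m1 - m2) ** 2 / v + math.log(v)
    return -0.5 * total


def fuse_gaussians(mus, variances):
    """Fuse several diagonal Gaussians into one prototype (mean, variance).

    Assumption: precision-weighted fusion, so confident (low-variance)
    representations dominate the prototype mean.
    """
    dim = len(mus[0])
    proto_mu, proto_var = [], []
    for d in range(dim):
        precision = sum(1.0 / var[d] for var in variances)
        weighted = sum(mu[d] / var[d] for mu, var in zip(mus, variances))
        proto_mu.append(weighted / precision)
        proto_var.append(1.0 / precision)
    return proto_mu, proto_var


def sample_virtual_negatives(proto_mu, proto_var, n, rng):
    """Draw n points from the prototype Gaussian (hypothetical VN generator)."""
    return [
        [rng.gauss(m, math.sqrt(v)) for m, v in zip(proto_mu, proto_var)]
        for _ in range(n)
    ]
```

Under these assumptions, two representations with identical means score lower when their variances are large, which is one way of reducing the influence of ambiguous pixels. A prototype maintained over the whole training process could be kept incrementally by storing running sums of the precisions and precision-weighted means instead of fusing a full list at once.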