Semi-supervised semantic segmentation has attracted growing research interest because it reduces the need for large-scale, fully annotated training data by exploiting large amounts of unlabelled data. Current methods often suffer from confirmation bias introduced by the pseudo-labelling process, which a co-training framework can alleviate. Existing co-training-based semi-supervised semantic segmentation methods rely on hand-crafted perturbations to prevent the sub-nets from collapsing into each other, but such artificial perturbations cannot lead to the optimal solution. In this work, we propose a new conflict-based cross-view consistency (CCVC) method, built on a two-branch co-training framework, for semi-supervised semantic segmentation. Our method aims to force the two sub-nets to learn informative features from irrelevant views. In particular, we first propose a new cross-view consistency (CVC) strategy that introduces a feature discrepancy loss to encourage the two sub-nets to learn distinct features from the same input, while these distinct features are still expected to produce consistent prediction scores for that input. The CVC strategy helps prevent the two sub-nets from collapsing. In addition, we propose a conflict-based pseudo-labelling (CPL) method so that the model learns more useful information from conflicting predictions, which leads to a stable training process. We validate our approach on the widely used benchmark datasets PASCAL VOC 2012 and Cityscapes, where it achieves new state-of-the-art performance.
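
The two ideas described above, a feature discrepancy loss (CVC) and extra emphasis on conflicting predictions (CPL), can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the discrepancy loss is taken to be one plus the cosine similarity between the two sub-nets' per-pixel features, and the conflict handling is taken to be a larger weight on pixels where the two sub-nets disagree and the labelling sub-net is confident. The function names and the parameters `conf_thresh` and `conflict_weight` are hypothetical and may not match the method as actually defined.

```python
import numpy as np


def cosine_discrepancy_loss(feat_a, feat_b, eps=1e-8):
    """Assumed form of a feature discrepancy loss: mean of (1 + cosine similarity).

    feat_a, feat_b: arrays of shape (N, C), one feature vector per pixel,
    produced by the two sub-nets for the same input.
    The value lies in [0, 2]; minimising it pushes the features apart.
    """
    dot = np.sum(feat_a * feat_b, axis=1)
    norms = np.linalg.norm(feat_a, axis=1) * np.linalg.norm(feat_b, axis=1) + eps
    return float(np.mean(1.0 + dot / norms))


def softmax(logits):
    """Numerically stable softmax over the class axis (axis 1)."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def conflict_weighted_ce(logits_learner, logits_labeller,
                         conf_thresh=0.9, conflict_weight=2.0):
    """Cross-entropy of one sub-net against the other's hard pseudo-labels.

    Hypothetical weighting: pixels where the two sub-nets disagree AND the
    labelling sub-net is confident (max prob > conf_thresh) are up-weighted
    by conflict_weight; all other pixels keep weight 1.
    logits_*: arrays of shape (N, K), per-pixel class logits.
    """
    p_learner = softmax(logits_learner)
    p_labeller = softmax(logits_labeller)
    pseudo = p_labeller.argmax(axis=1)        # hard pseudo-labels
    confidence = p_labeller.max(axis=1)       # labeller's confidence
    conflict = pseudo != p_learner.argmax(axis=1)
    weights = np.where(conflict & (confidence > conf_thresh), conflict_weight, 1.0)
    ce = -np.log(p_learner[np.arange(len(pseudo)), pseudo] + 1e-12)
    return float(np.mean(weights * ce))
```

In a two-branch setup, a sketch like this would be applied symmetrically: each sub-net in turn acts as the labeller for the other, and the discrepancy term is added to the supervised and consistency losses with a weighting coefficient chosen by the practitioner.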