The scarcity of labeled data in real-world scenarios is a critical bottleneck for the effectiveness of deep learning. Semi-supervised semantic segmentation has become a typical way to achieve a desirable tradeoff between annotation cost and segmentation performance. However, previous approaches, whether based on consistency regularization or self-training, tend to neglect the contextual knowledge embedded within inter-pixel relations. This neglect leads to suboptimal performance and limited generalization. In this paper, we propose a novel approach, IPixMatch, designed to mine the neglected but valuable Inter-Pixel information for semi-supervised learning. Specifically, IPixMatch is constructed as an extension of the standard teacher-student network, incorporating additional loss terms to capture inter-pixel relations. It is particularly effective in low-data regimes, making efficient use of the limited labeled data and extracting as much utility as possible from the available unlabeled data. Furthermore, IPixMatch can be integrated seamlessly into most teacher-student frameworks without modifying the model or adding components. Our straightforward IPixMatch method demonstrates consistent performance improvements across various benchmark datasets under different partitioning protocols.
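To make the idea of "additional loss terms to capture inter-pixel relations" concrete, the following is a minimal sketch of how such a term could sit next to the usual per-pixel pseudo-label loss in a teacher-student setup. The abstract does not specify the form of the IPixMatch losses, so everything here is an assumption made for illustration: the function names (`affinity`, `pixel_loss`, `relation_loss`, `total_unlabeled_loss`), the dot-product affinity between per-pixel class distributions, the squared-error comparison, and the `relation_weight` parameter are all hypothetical and should not be read as the paper's formulation.

```python
import math


def softmax(logits):
    """Turn one pixel's class logits into a probability vector."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def affinity(probs):
    """N x N matrix of dot products between per-pixel class distributions.

    Assumption: this is one simple way to encode inter-pixel relations.
    """
    return [[sum(a * b for a, b in zip(p, q)) for q in probs] for p in probs]


def pixel_loss(teacher_probs, student_probs):
    """Cross-entropy of student predictions against teacher hard pseudo-labels."""
    total = 0.0
    for t, s in zip(teacher_probs, student_probs):
        label = max(range(len(t)), key=t.__getitem__)  # teacher argmax
        total -= math.log(s[label] + 1e-12)
    return total / len(teacher_probs)


def relation_loss(teacher_probs, student_probs):
    """Mean squared gap between teacher and student affinity matrices."""
    a_t = affinity(teacher_probs)
    a_s = affinity(student_probs)
    n = len(a_t)
    gap = sum((a_t[i][j] - a_s[i][j]) ** 2 for i in range(n) for j in range(n))
    return gap / (n * n)


def total_unlabeled_loss(teacher_logits, student_logits, relation_weight=1.0):
    """Per-pixel term plus a hypothetical inter-pixel relation term."""
    t = [softmax(x) for x in teacher_logits]
    s = [softmax(x) for x in student_logits]
    return pixel_loss(t, s) + relation_weight * relation_loss(t, s)


# Three pixels, two classes; logits are made up for illustration.
teacher_logits = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
student_logits = [[2.0, 0.0], [2.0, 0.0], [1.0, 1.0]]
loss = total_unlabeled_loss(teacher_logits, student_logits)
```

In this sketch the relation term only adds a scalar to the unlabeled loss, which is consistent with the abstract's claim that the method needs no change to the model or extra components. In a real pipeline the teacher outputs would be treated as constants (no gradient), and the pairwise comparison would likely be restricted to a neighborhood or a sampled subset of pixels to keep the cost manageable.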