Existing work on Weakly Supervised Semantic Segmentation (WSSS) based on Convolutional Neural Networks (CNNs) has focused predominantly on improving the multi-label classification network stage, with limited attention given to the equally important downstream segmentation network. Furthermore, the local convolutions of CNNs cannot model broad inter-category dependencies. This paper therefore introduces a graph reasoning-based approach that improves WSSS holistically by enhancing the multi-label classification and segmentation network stages simultaneously. In the multi-label classification stage, external knowledge is integrated and combined with graph convolutional networks (GCNs) to reason globally about inter-class dependencies. This encourages the network to uncover features in non-salient image regions, thereby improving the completeness of the generated pseudo-labels. In the segmentation stage, the proposed Graph Reasoning Mapping (GRM) module leverages knowledge obtained from textual databases to support contextual reasoning over the class representations within image regions. The GRM module strengthens the high-level semantic feature representation produced by the segmentation network's local convolutions while dynamically learning semantic coherence for each individual sample. Using only image-level supervision, we achieve state-of-the-art WSSS performance on the PASCAL VOC 2012 and MS-COCO datasets. Extensive experiments on both the multi-label classification and segmentation network stages demonstrate the effectiveness of the proposed graph reasoning approach for advancing WSSS.
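The abstract does not give the architecture in detail, so the following is only a minimal sketch of one common way GCN-based reasoning over inter-class dependencies can be wired into a multi-label classifier. It assumes a class graph (here a hypothetical, hand-written adjacency matrix standing in for external knowledge), per-class embeddings, and two graph-convolution layers that turn those embeddings into per-class classifier weights. All names (`normalize_adjacency`, `gcn_layer`, `class_scores`), the dimensions, and the random values are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Add self-loops and apply the symmetric normalization D^-1/2 (A + I) D^-1/2."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[a_hat[i][j] * inv_sqrt[i] * inv_sqrt[j] for j in range(n)] for i in range(n)]


def gcn_layer(a_norm, h, w, relu=True):
    """One graph convolution: A_norm @ H @ W, optionally followed by ReLU."""
    out = matmul(matmul(a_norm, h), w)
    if relu:
        out = [[max(v, 0.0) for v in row] for row in out]
    return out


def class_scores(image_feats, class_emb, adj, w1, w2):
    """Score each image against classifiers produced by reasoning over the class graph."""
    a_norm = normalize_adjacency(adj)
    hidden = gcn_layer(a_norm, class_emb, w1)                 # (C, hidden)
    classifiers = gcn_layer(a_norm, hidden, w2, relu=False)   # (C, feat_dim)
    transposed = [list(col) for col in zip(*classifiers)]     # (feat_dim, C)
    return matmul(image_feats, transposed)                    # (N, C) logits


def random_matrix(rows, cols, rng):
    """Small random matrix used as a stand-in for learned weights or features."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
num_classes, emb_dim, hid_dim, feat_dim = 4, 6, 5, 8

# Hypothetical class graph: an edge means two classes are assumed to be related.
adj = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]

class_emb = random_matrix(num_classes, emb_dim, rng)  # stand-in for word embeddings
w1 = random_matrix(emb_dim, hid_dim, rng)
w2 = random_matrix(hid_dim, feat_dim, rng)
image_feats = random_matrix(2, feat_dim, rng)         # stand-in for pooled CNN features

scores = class_scores(image_feats, class_emb, adj, w1, w2)
print(len(scores), len(scores[0]))  # number of images, number of classes
```

In this sketch each class's classifier is a function of its neighbours in the graph, so evidence for one class can raise the score of related classes. That is one plausible reading of "globally reason about inter-class dependencies"; the paper's actual formulation may differ.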