Existing work on Weakly Supervised Semantic Segmentation (WSSS) based on Convolutional Neural Networks (CNNs) has focused predominantly on improving the multi-label classification network stage, with limited attention given to the equally important downstream segmentation network. Furthermore, the local convolutions used by CNNs cannot model extensive inter-category dependencies. This paper therefore introduces a graph reasoning-based approach that improves WSSS holistically by enhancing the multi-label classification and segmentation network stages simultaneously. In the multi-label classification stage, external knowledge is integrated and combined with Graph Convolutional Networks (GCNs) to reason globally about inter-class dependencies. This encourages the network to uncover features in non-salient image regions, which improves the completeness of the generated pseudo-labels. In the segmentation stage, the proposed Graph Reasoning Mapping (GRM) module leverages knowledge obtained from textual databases to support contextual reasoning over class representations within image regions. The GRM module enhances the high-level semantic feature representation produced by the segmentation network's local convolutions, while dynamically learning semantic coherence for individual samples. Using only image-level supervision, we achieve state-of-the-art WSSS performance on the PASCAL VOC 2012 and MS-COCO datasets. Extensive experiments on both the multi-label classification and segmentation network stages demonstrate the effectiveness of the proposed graph reasoning approach for advancing WSSS.
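To make the idea of GCN-based reasoning over inter-class dependencies concrete, the following is a minimal sketch of a single graph-convolution step over a class graph. It is not the architecture of this paper, which the abstract does not specify. It assumes the widely used symmetric normalization D^{-1/2}(A + I)D^{-1/2}, and the three-class adjacency matrix, node features, and identity weight matrix are hypothetical values chosen only for illustration.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square, non-negative adjacency matrix."""
    n = len(adj)
    # Add self-loops so every class node keeps part of its own representation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a_hat]
    inv_sqrt = [1.0 / math.sqrt(d) for d in degree]
    return [[inv_sqrt[i] * a_hat[i][j] * inv_sqrt[j] for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj_norm, features, weight):
    """One propagation step: ReLU(adj_norm @ features @ weight)."""
    out = matmul(matmul(adj_norm, features), weight)
    return [[max(0.0, value) for value in row] for row in out]


# Hypothetical class graph: classes 0 and 1 are related, class 2 is isolated.
adj = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
# Hypothetical 2-dimensional class embeddings (one row per class).
features = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
]
# Identity weights, so the effect of graph propagation alone is visible.
weight = [
    [1.0, 0.0],
    [0.0, 1.0],
]

adj_norm = normalize_adjacency(adj)
updated = gcn_layer(adj_norm, features, weight)
```

In this toy setup, the two connected classes exchange information and end up with averaged embeddings, while the isolated class keeps its own. In a WSSS setting the adjacency would presumably be derived from external knowledge about class relationships rather than written by hand.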