Weakly Supervised Semantic Segmentation (WSSS), which uses image-level labels, has attracted significant attention because of its cost-effectiveness. Previous methods mainly strengthen inter-class differences to avoid the semantic ambiguity between classes that can lead to erroneous activation. However, they overlook the positive role of information shared between similar classes. Categories within the same cluster share some features, and allowing the model to recognize these features can further relieve the semantic ambiguity between those classes. To identify and exploit this shared information, we introduce a novel WSSS framework called Prompt Categories Clustering (PCC). Specifically, we explore the ability of Large Language Models (LLMs) to derive category clusters through prompts. These clusters represent the intrinsic relationships between categories. By integrating this relational information into the training network, our model better learns the hidden connections between categories. Experimental results on the PASCAL VOC 2012 dataset demonstrate the effectiveness of our approach, which surpasses existing state-of-the-art WSSS methods.
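To make the idea of cluster-level supervision concrete, the following is a minimal sketch of how category clusters could be turned into an additional image-level label vector. It is not the PCC implementation: the prompt wording, the cluster assignment in `HYPOTHETICAL_CLUSTERS`, and the helper names `build_cluster_prompt`, `cluster_labels`, and `encode` are all assumptions made for illustration. Only the 20 PASCAL VOC 2012 category names are taken from the dataset itself.

```python
# The 20 object categories of PASCAL VOC 2012 (background excluded).
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]

# ASSUMPTION: hand-written clusters standing in for an LLM response.
# The clusters actually derived by the paper's prompts may differ.
HYPOTHETICAL_CLUSTERS = {
    "vehicle": ["aeroplane", "bicycle", "boat", "bus", "car", "motorbike", "train"],
    "animal": ["bird", "cat", "cow", "dog", "horse", "sheep"],
    "indoor": ["bottle", "chair", "diningtable", "pottedplant", "sofa", "tvmonitor"],
    "person": ["person"],
}


def build_cluster_prompt(classes):
    """Build a hypothetical prompt asking an LLM to group the categories."""
    return (
        "Group the following object categories into clusters of "
        "semantically similar categories: " + ", ".join(classes) + "."
    )


def encode(names):
    """Encode a set of category names as a multi-hot vector over VOC_CLASSES."""
    return [int(c in names) for c in VOC_CLASSES]


def cluster_labels(class_labels, clusters=HYPOTHETICAL_CLUSTERS):
    """Map a multi-hot class vector to a multi-hot cluster vector.

    A cluster is marked present if any of its member categories is present.
    """
    present = {c for c, flag in zip(VOC_CLASSES, class_labels) if flag}
    return [
        int(any(member in present for member in members))
        for members in clusters.values()
    ]
```

Under these assumed clusters, an image labelled with `cat` and `sofa` activates the `animal` and `indoor` clusters, so `cluster_labels(encode({"cat", "sofa"}))` yields `[0, 1, 1, 0]`. Such a vector could serve as an auxiliary classification target alongside the usual per-class labels; how the relational information is actually integrated into the training network is not specified in this abstract.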