This study introduces Masked Collaborative Contrast (MCC), an approach for highlighting semantic regions in weakly supervised semantic segmentation. MCC draws on masked image modeling and contrastive learning to build a framework that induces keys to contract toward semantic regions. Unlike prevalent techniques that generate masks by directly erasing patch regions in the input image, we examine the neighborhood relations of patch tokens by applying masks to the keys of the affinity matrix. Moreover, we generate positive and negative samples for contrastive learning by contrasting the masked local output with the global output. Extensive experiments on commonly used datasets demonstrate that the proposed MCC mechanism effectively aligns global and local perspectives within the image and achieves strong performance. The source code is available at \url{https://github.com/fwu11/MCC}.
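The idea of masking keys on the affinity matrix, rather than erasing image patches, can be illustrated with a minimal NumPy sketch. It is not the released implementation: the identity query/key/value projections, the mean pooling, the InfoNCE-style loss, the random masks, and every function name are assumptions made for illustration. The abstract does not state how a masked view is labeled positive or negative, so `is_positive` is a hand-supplied placeholder.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; masked entries (-inf) become exactly 0.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def attention_output(tokens, key_mask=None):
    """Self-attention over patch tokens of shape [N, D].

    key_mask: optional bool array of shape [N]; True keeps a key.
    Assumption: identity projections, so queries = keys = values = tokens.
    """
    n, d = tokens.shape
    affinity = tokens @ tokens.T / np.sqrt(d)  # [N, N] affinity matrix
    if key_mask is not None:
        # Mask key columns on the affinity matrix; the input is untouched.
        affinity = np.where(key_mask[None, :], affinity, -np.inf)
    return softmax(affinity, axis=-1) @ tokens

def pooled(x):
    # Mean-pool tokens and L2-normalize (an assumed pooling choice).
    v = x.mean(axis=0)
    return v / (np.linalg.norm(v) + 1e-8)

def contrastive_loss(global_vec, local_vecs, is_positive, temperature=0.1):
    """InfoNCE-style loss between one global vector and M local vectors.

    is_positive: bool array of shape [M] with at least one True entry.
    """
    sims = local_vecs @ global_vec / temperature        # [M]
    log_prob = sims - np.log(np.sum(np.exp(sims)))      # log-softmax over views
    return -np.mean(log_prob[is_positive])

# Toy data: 16 patch tokens with 8 channels (random, for illustration only).
rng = np.random.default_rng(0)
tokens = rng.normal(size=(16, 8))

# Global view: attention over all keys.
global_vec = pooled(attention_output(tokens))

# Local views: four random key masks, each keeping at least one key.
masks = rng.random((4, 16)) < 0.5
masks[:, 0] = True
local_vecs = np.stack([pooled(attention_output(tokens, m)) for m in masks])

# Placeholder labels; the real positive/negative rule is not in the abstract.
is_positive = np.array([True, True, False, False])
loss = contrastive_loss(global_vec, local_vecs, is_positive)
```

Because the mask is applied to columns of the affinity matrix, every query still produces an output; only the set of keys it can attend to shrinks, which is what distinguishes this from erasing patches in the input image.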