In recent years, several Weakly Supervised Semantic Segmentation (WS3) methods have been proposed that use class activation maps (CAMs) generated by a classifier to produce pseudo-ground truths for training segmentation models. While CAMs are good at highlighting the discriminative regions (DRs) of an image, they are known to disregard regions of the object that do not contribute to the classifier's prediction, termed non-discriminative regions (NDRs). In contrast, attribution methods such as saliency maps assign a score to every pixel based on its contribution to the classification prediction. This paper provides a comprehensive comparison between saliency maps and CAMs for WS3, examining their similarities and differences from multiple perspectives. Moreover, we introduce new evaluation metrics that assess the WS3 performance of alternative methods relative to CAMs. Our empirical studies on benchmark datasets demonstrate that saliency maps are effective in addressing this limitation of CAMs. Furthermore, we propose random cropping as a stochastic aggregation technique that improves the performance of saliency maps, making them a strong alternative to CAMs for WS3.
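To make the idea of random cropping as a stochastic aggregation step more concrete, the following is a minimal sketch, not the paper's implementation. The abstract does not state how crops are sampled or how their scores are combined, so the sketch assumes uniformly placed fixed-size crops and per-pixel averaging. The names `aggregate_over_random_crops`, `saliency_fn`, and `toy_saliency` are hypothetical; `toy_saliency` is only a stand-in for a real attribution method, which would require a trained classifier.

```python
import numpy as np


def aggregate_over_random_crops(image, saliency_fn, crop_size, n_crops, seed=0):
    """Average per-pixel saliency scores over randomly placed crops.

    Assumption: uniform crop placement and mean aggregation; the paper
    may use a different sampling or combination scheme.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    ch, cw = crop_size
    total = np.zeros((h, w), dtype=np.float64)
    counts = np.zeros((h, w), dtype=np.float64)
    for _ in range(n_crops):
        # Upper bound of rng.integers is exclusive, so the crop always fits.
        top = rng.integers(0, h - ch + 1)
        left = rng.integers(0, w - cw + 1)
        crop = image[top:top + ch, left:left + cw]
        total[top:top + ch, left:left + cw] += saliency_fn(crop)
        counts[top:top + ch, left:left + cw] += 1
    # Pixels never covered by any crop keep a score of 0.
    return np.divide(total, counts, out=np.zeros_like(total), where=counts > 0)


def toy_saliency(crop):
    """Hypothetical placeholder: absolute deviation from the crop's mean intensity."""
    gray = crop.mean(axis=-1) if crop.ndim == 3 else crop
    return np.abs(gray - gray.mean())


# Usage on a synthetic 64x64 RGB image.
image = np.random.default_rng(1).random((64, 64, 3))
agg = aggregate_over_random_crops(image, toy_saliency, crop_size=(32, 32), n_crops=50)
```

Because each crop sees a different context, a pixel's score is averaged over several views of the image. In practice `saliency_fn` would be replaced by a gradient-based attribution computed from the classifier.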