Achieving high-quality semantic segmentation predictions using only image-level labels enables a new level of real-world applicability. Although state-of-the-art networks deliver reliable predictions, the amount of handcrafted pixel-wise annotation needed to achieve these results is not feasible in many real-world applications. Hence, several works have already targeted this bottleneck, using classifier-based techniques such as Class Activation Maps (CAMs) as a base. To address CAMs' weaknesses of fuzzy borders and incomplete predictions, state-of-the-art approaches rely only on adding regularization terms to the classifier loss or on pixel-similarity-based refinement after the fact. We propose a framework that introduces an additional module using object perimeters for improved saliency. We define object perimeter information as the line separating the object from the background. Our new PerimeterFit module is applied to pre-refine the CAM predictions before they are passed to the pixel-similarity-based network. In this way, PerimeterFit increases the quality of the CAM prediction while simultaneously improving the false negative rate. We investigated a wide range of state-of-the-art unsupervised semantic segmentation networks and edge detection techniques to create useful perimeter maps, which enable our framework to predict object locations with sharper perimeters. We achieve an improvement of up to 1.5% over frameworks without our PerimeterFit module. We conduct an exhaustive analysis to illustrate that our framework enhances existing state-of-the-art frameworks for image-level-based semantic segmentation. The framework is open-source and accessible online at https://github.com/ErikOstrowski/Perimeter-based-Semantic-Segmentation.