SILOP: An Automated Framework for Semantic Segmentation Using Image Labels Based on Object Perimeters

Achieving high-quality semantic segmentation predictions from image-level labels alone enables a new level of real-world applicability. Although state-of-the-art networks deliver reliable predictions, the volume of handcrafted pixel-wise annotations they require is not feasible in many real-world applications. Hence, several works have already targeted this bottleneck, using classifier-based techniques such as Class Activation Maps~\cite{CAM} (CAMs) as a base. To address the CAMs' weaknesses of fuzzy borders and incomplete predictions, state-of-the-art approaches rely only on adding regularization terms to the classifier loss or on pixel-similarity-based refinement after the fact. We propose a framework that introduces an additional module using object perimeters for improved saliency. We define object perimeter information as the line separating the object from the background. Our new PerimeterFit module is applied to pre-refine the CAM predictions before the pixel-similarity-based network is used. In this way, PerimeterFit increases the quality of the CAM prediction while simultaneously reducing the false negative rate. We investigate a wide range of state-of-the-art unsupervised semantic segmentation networks and edge detection techniques to create useful perimeter maps, which enable our framework to predict object locations with sharper perimeters. We achieve up to 1.5% improvement over frameworks without our PerimeterFit module. We conduct an exhaustive analysis to illustrate that SILOP enhances existing state-of-the-art frameworks for image-level-based semantic segmentation. The framework is open-source and accessible online at https://github.com/ErikOstrowski/SILOP.
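To make the idea of pre-refining a CAM with a perimeter map concrete, the following is a minimal sketch and not the authors' PerimeterFit implementation. It assumes a simple scheme in which confident CAM pixels act as seeds and grow until they reach an edge pixel; the function name `perimeter_constrained_fill`, the `seed_thresh` parameter, and the toy 7x7 inputs are all hypothetical and chosen only for illustration.

```python
from collections import deque


def perimeter_constrained_fill(cam, perimeter, seed_thresh=0.5):
    """Grow confident CAM seeds until they hit a perimeter pixel.

    Illustrative assumption only, not the SILOP PerimeterFit algorithm.
    cam:       2D list of activation scores in [0, 1]
    perimeter: 2D list of bools, True where an edge detector fired
    Returns a 2D list of bools marking the refined object mask.
    """
    h, w = len(cam), len(cam[0])
    mask = [[False] * w for _ in range(h)]
    queue = deque()
    # Seeds: confident CAM pixels that do not lie on the perimeter itself.
    for y in range(h):
        for x in range(w):
            if cam[y][x] >= seed_thresh and not perimeter[y][x]:
                mask[y][x] = True
                queue.append((y, x))
    # Breadth-first growth over 4-connected neighbours, blocked by the perimeter.
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            inside = 0 <= ny < h and 0 <= nx < w
            if inside and not mask[ny][nx] and not perimeter[ny][nx]:
                mask[ny][nx] = True
                queue.append((ny, nx))
    return mask


# Toy input: a closed square perimeter whose interior is a 3x3 object.
size = 7
perimeter = [
    [
        (y in (1, 5) and 1 <= x <= 5) or (x in (1, 5) and 1 <= y <= 5)
        for x in range(size)
    ]
    for y in range(size)
]
# The CAM is "incomplete": it fires on the object's centre pixel only.
cam = [[0.0] * size for _ in range(size)]
cam[3][3] = 0.9

refined = perimeter_constrained_fill(cam, perimeter)
print(sum(map(sum, refined)))  # 9: the whole 3x3 interior is recovered
```

In this toy case the single activated pixel expands to the full interior, which mirrors the claimed reduction in false negatives, while the perimeter keeps the mask from spilling into the background. The sketch relies on a closed perimeter; with a gap in the edge map the fill would leak, so a practical method would need to handle incomplete perimeters.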
