In this paper, we introduce a novel Convolution-based Probability Gradient (CPG) loss for semantic segmentation. It employs convolution kernels similar to the Sobel operator, which computes the gradient of pixel intensity in an image. Applying these kernels to category-wise probability maps yields probability gradients for both the ground truth and the prediction, and the loss improves network performance by maximizing the similarity between the two. Moreover, to specifically improve accuracy near object boundaries, we extract the boundaries from the ground-truth probability gradient and apply the CPG loss exclusively to boundary pixels. The CPG loss is convenient to apply and effective: it establishes relationships between neighboring pixels through convolution and therefore measures error along a different dimension than pixel-wise loss functions such as cross-entropy loss. We conduct qualitative and quantitative analyses to evaluate the impact of the CPG loss on three well-established networks (DeepLabv3-ResNet50, HRNetV2-OCR, and LRASPP_MobileNet_V3_Large) across three standard segmentation datasets (Cityscapes, COCO-Stuff, ADE20K). Our extensive experimental results demonstrate that the CPG loss consistently and significantly improves the mean Intersection over Union.
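The idea described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. Several choices here are assumptions made only for illustration: the exact Sobel weights (the abstract says only that the kernels are "similar to" Sobel), edge-replicating padding, the use of an L1 distance as the dissimilarity to minimize (the abstract speaks of maximizing similarity without naming the measure), and the helper names `filter3x3` and `cpg_loss_sketch`. A practical version would use a tensor library's 2D convolution instead of nested loops.

```python
# Assumed kernels: the standard Sobel pair for horizontal/vertical derivatives.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def filter3x3(prob, kernel):
    """Slide a 3x3 kernel over a 2D map, replicating edge pixels at the border."""
    h, w = len(prob), len(prob[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp row index
                    xx = min(max(x + dx, 0), w - 1)  # clamp column index
                    acc += kernel[dy + 1][dx + 1] * prob[yy][xx]
            out[y][x] = acc
    return out


def cpg_loss_sketch(pred, target, eps=1e-6):
    """Mean L1 gap between predicted and ground-truth probability gradients.

    pred, target: nested lists indexed [class][row][col]; target is one-hot.
    Only pixels where the ground-truth gradient is non-zero (the object
    boundary) contribute, mirroring the boundary restriction in the abstract.
    """
    total, count = 0.0, 0
    for p_c, t_c in zip(pred, target):
        gx_p, gy_p = filter3x3(p_c, SOBEL_X), filter3x3(p_c, SOBEL_Y)
        gx_t, gy_t = filter3x3(t_c, SOBEL_X), filter3x3(t_c, SOBEL_Y)
        for y in range(len(t_c)):
            for x in range(len(t_c[0])):
                if abs(gx_t[y][x]) + abs(gy_t[y][x]) > eps:  # boundary pixel
                    total += abs(gx_p[y][x] - gx_t[y][x])
                    total += abs(gy_p[y][x] - gy_t[y][x])
                    count += 1
    return total / count if count else 0.0


# Toy 4x4 scene with two classes: left half is class 0, right half is class 1.
target = [
    [[1.0, 1.0, 0.0, 0.0] for _ in range(4)],
    [[0.0, 0.0, 1.0, 1.0] for _ in range(4)],
]
# A maximally uncertain prediction: probability 0.5 everywhere.
uniform = [[[0.5] * 4 for _ in range(4)] for _ in range(2)]

print(cpg_loss_sketch(target, target))   # perfect prediction
print(cpg_loss_sketch(uniform, target))  # flat prediction with no edges
```

In this toy case the loss vanishes when the prediction matches the ground truth and is positive for the flat prediction, which has no probability edge where the ground truth has one. A per-pixel loss evaluates each pixel in isolation, whereas this term depends on how probabilities change between neighboring pixels.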