Weakly-supervised semantic segmentation (WSSS) performs pixel-wise classification using only image-level labels for training. Despite the difficulty of this task, the research community has achieved promising results over the last five years. Still, the current WSSS literature lacks a detailed understanding of how well methods perform on objects of different sizes. We therefore propose a novel evaluation metric that provides a comprehensive assessment across object sizes, and we collect a size-balanced evaluation set to complement PASCAL VOC. With these two tools, we reveal that existing WSSS methods struggle to capture small objects. Furthermore, we propose a size-balanced cross-entropy loss coupled with a proper training strategy. It generally improves existing WSSS methods, as validated on ten baselines across three datasets.