Class activation mapping~(CAM), a visualization technique for interpreting deep learning models, is now commonly used for weakly supervised semantic segmentation~(WSSS) and object localization~(WSOL). A CAM is a weighted aggregation of feature maps in which the maps with high class relevance are activated. Current CAM methods derive the weights from training outcomes, such as predicted scores~(forward information) or gradients~(backward information). However, with small-scale data, unstable training may yield less effective model outcomes and unreliable weights, resulting in incorrect activation and noisy CAM seeds. In this paper, we propose an outcome-agnostic CAM approach, called BroadCAM, for small-scale weakly supervised applications. Because the broad learning system~(BLS) is independent of the model learning, BroadCAM prevents the weights from being affected by unreliable model outcomes when training data are scarce. Evaluated on VOC2012 (natural images) and BCSS-WSSS (medical images) for WSSS and on OpenImages30k for WSOL, BroadCAM outperforms existing CAM methods with small-scale data (less than 5\%) across different CNN architectures. It also achieves SOTA performance with large-scale training data. Extensive qualitative comparisons demonstrate how BroadCAM activates the high class-relevance feature maps and generates reliable CAMs with small-scale training data.
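The weighted aggregation that every CAM variant shares can be illustrated with a minimal sketch. It assumes the per-channel weights are already given; how those weights are obtained (from scores, gradients, or, in BroadCAM, from a BLS) is exactly what distinguishes the methods and is not shown here. The function name `weighted_cam`, the ReLU step, and the max-normalization are illustrative assumptions following common CAM practice, not the authors' implementation.

```python
import numpy as np


def weighted_cam(feature_maps, weights):
    """Aggregate K feature maps into one class activation map.

    feature_maps: array of shape (K, H, W), e.g. the last conv-layer output.
    weights:      array of shape (K,), one class-relevance weight per map
                  (hypothetical input; each CAM method computes it differently).
    Returns an (H, W) map scaled to [0, 1].
    """
    feature_maps = np.asarray(feature_maps, dtype=np.float64)
    weights = np.asarray(weights, dtype=np.float64)

    # Weighted sum over the channel axis: sum_k w_k * A_k -> shape (H, W).
    cam = np.tensordot(weights, feature_maps, axes=1)

    # Keep only positive evidence for the class (common CAM convention).
    cam = np.maximum(cam, 0.0)

    # Scale to [0, 1]; leave an all-zero map untouched to avoid dividing by 0.
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Two toy 2x2 feature maps: the first is class-relevant, the second is not.
maps = np.array([[[1.0, 0.0], [0.0, 0.0]],
                 [[0.0, 0.0], [0.0, 2.0]]])
cam = weighted_cam(maps, weights=[1.0, -1.0])
```

In this toy case the negatively weighted map is suppressed, so only the location highlighted by the first map stays active. Unreliable weights would flip that outcome, which is the failure mode the abstract attributes to small-scale training.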