Extracting image semantics effectively and assigning corresponding labels to multiple objects or attributes in natural images is challenging due to complex scene contents and confusing label dependencies. Recent works have focused on modeling label relationships with graphs and on understanding object regions using class activation maps (CAM). However, these methods ignore the complex intra- and inter-category relationships among specific semantic features, and CAM is prone to generating noisy information. To this end, we propose a novel semantic-aware dual contrastive learning framework that incorporates sample-to-sample contrastive learning (SSCL) as well as prototype-to-sample contrastive learning (PSCL). Specifically, we leverage semantic-aware representation learning to extract category-related local discriminative features and construct category prototypes. Then, based on SSCL, label-level visual representations of the same category are aggregated together, and features belonging to distinct categories are separated. Meanwhile, we construct a novel PSCL module to narrow the distance between positive samples and category prototypes and to push negative samples away from the corresponding category prototypes. Finally, the discriminative label-level features related to the image content are accurately captured by jointly training the three parts above. Experiments on five challenging large-scale public datasets demonstrate that our proposed method is effective and outperforms state-of-the-art methods. Code and supplementary materials are available at https://github.com/yu-gi-oh-leilei/SADCL.
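The abstract does not state the exact loss functions, so the following is only a minimal sketch of how the two contrastive terms could look, assuming InfoNCE-style and supervised-contrastive-style formulations over cosine similarity. The function names (`pscl_loss`, `sscl_loss`), the temperature `tau`, and the toy data are all hypothetical and are not taken from the released code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def pscl_loss(features, labels, prototypes, tau=0.5):
    """Assumed prototype-to-sample loss (InfoNCE-style, not the paper's formula).

    features[i][c]: label-level feature of image i for category c
    labels[i][c]:   1 if category c is present in image i, else 0
    prototypes[c]:  prototype vector of category c
    """
    losses = []
    for c, proto in enumerate(prototypes):
        sims = [math.exp(cosine(proto, f[c]) / tau) for f in features]
        # Negatives: features of category c from images that lack category c.
        neg_sum = sum(s for s, y in zip(sims, labels) if y[c] == 0)
        for s, y in zip(sims, labels):
            if y[c] == 1:  # positive sample: pull toward the prototype
                losses.append(-math.log(s / (s + neg_sum)))
    return sum(losses) / len(losses) if losses else 0.0


def sscl_loss(features, labels, tau=0.5):
    """Assumed sample-to-sample loss (supervised-contrastive-style).

    Only label-level features of categories present in an image are used.
    Features of the same category are positives; all others are negatives.
    """
    items = [(c, f[c]) for f, y in zip(features, labels)
             for c in range(len(y)) if y[c] == 1]
    losses = []
    for a, (cat_a, feat_a) in enumerate(items):
        others = [(cat_b, math.exp(cosine(feat_a, feat_b) / tau))
                  for b, (cat_b, feat_b) in enumerate(items) if b != a]
        pos = [s for cat_b, s in others if cat_b == cat_a]
        if not pos:  # anchor has no same-category partner in the batch
            continue
        denom = sum(s for _, s in others)
        losses.append(-sum(math.log(s / denom) for s in pos) / len(pos))
    return sum(losses) / len(losses) if losses else 0.0


# Toy batch (hypothetical): 3 images, 2 categories, 2-dimensional features.
prototypes = [[1.0, 0.0], [0.0, 1.0]]
labels = [[1, 0], [0, 1], [1, 1]]
features = [
    [[1.0, 0.0], [0.0, -1.0]],   # image 0: category 0 present
    [[-1.0, 0.0], [0.0, 1.0]],   # image 1: category 1 present
    [[1.0, 0.0], [0.0, 1.0]],    # image 2: both categories present
]
loss_p = pscl_loss(features, labels, prototypes)
loss_s = sscl_loss(features, labels)
```

In a joint training setup such as the one the abstract describes, terms like these would typically be added to a multi-label classification loss with weighting coefficients; the weights and the way prototypes are updated are left open here because the abstract does not specify them.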