Cross-modal hashing (CMH) has emerged as a popular technique for cross-modal retrieval owing to its low storage cost and high computational efficiency on large-scale data. Most existing methods implicitly assume that multi-modal data are correctly labeled, an assumption that is expensive and sometimes impossible to satisfy because imperfect annotations (i.e., noisy labels) are inevitable in real-world scenarios. Inspired by human cognitive learning, a few methods introduce self-paced learning (SPL) to train the model gradually from easy to hard samples, an approach often used to mitigate the effects of feature noise or outliers. How to use SPL to reduce the misleading effect of noisy labels on the hash model, however, remains largely unexplored. To tackle this problem, we propose a new cognitive cross-modal retrieval method called Robust Self-paced Hashing with Noisy Labels (RSHNL), which mimics the human cognitive process to identify noise while remaining robust to noisy labels. Specifically, we first propose a contrastive hashing learning (CHL) scheme to improve multi-modal consistency, thereby reducing the inherent semantic gap. We then propose center aggregation learning (CAL) to mitigate intra-class variations. Finally, we propose Noise-tolerance Self-paced Hashing (NSH), which dynamically estimates the learning difficulty of each instance and distinguishes noisy labels by difficulty level. For all pairs estimated to be clean, we further adopt a self-paced regularizer to learn hash codes gradually from easy to hard. Extensive experiments demonstrate that the proposed RSHNL markedly outperforms state-of-the-art CMH methods.
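
The easy-to-hard training schedule described in the abstract can be illustrated with a generic hard-threshold self-paced regularizer. This is a minimal sketch and not the RSHNL implementation: the function names, the example loss values, and the geometric growth of the threshold are all assumptions made for illustration. A sample takes part in training only once its loss falls below the current threshold, so a pair with a very large loss (a plausible sign of a noisy label) stays excluded for longer.

```python
def self_paced_weights(losses, threshold):
    """Hard self-paced weights: 1.0 for samples whose loss is below the threshold, else 0.0."""
    return [1.0 if loss < threshold else 0.0 for loss in losses]


def self_paced_schedule(losses, start, growth, steps):
    """Return a list of (threshold, weights) pairs as the threshold grows geometrically.

    A larger threshold admits harder samples, which gives the easy-to-hard ordering.
    The geometric growth rule is an assumption of this sketch.
    """
    threshold = start
    schedule = []
    for _ in range(steps):
        schedule.append((threshold, self_paced_weights(losses, threshold)))
        threshold *= growth
    return schedule


# Hypothetical per-pair losses; the last value stands in for a likely noisy pair.
losses = [0.1, 0.4, 0.9, 5.0]
schedule = self_paced_schedule(losses, start=0.5, growth=2.0, steps=3)
for threshold, weights in schedule:
    print(threshold, weights)
```

In a real training loop the losses would be recomputed after each update, and the weights would multiply the per-pair loss terms before backpropagation. The abstract does not specify how NSH sets the boundary between hard clean pairs and noisy pairs, so that step is left out here.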