Visible-Infrared Person Re-identification (VI-ReID) is a challenging cross-modal pedestrian retrieval task because of significant intra-class variations and cross-modal discrepancies among different cameras. Existing works mainly focus on embedding images of different modalities into a unified space to mine modality-shared features. They seek discriminative information only within these shared features and ignore the useful identity-aware information that is implicit in the modality-specific features. To address this issue, we propose a novel Implicit Discriminative Knowledge Learning (IDKL) network to uncover and leverage the implicit discriminative information contained within the modality-specific features. First, we extract modality-specific and modality-shared features using a novel dual-stream network. Then, the modality-specific features are purified to reduce their modality style discrepancies while preserving identity-aware discriminative knowledge. Subsequently, this implicit knowledge is distilled into the modality-shared features to enhance their distinctiveness. Finally, an alignment loss is proposed to minimize the modality discrepancy on the enhanced modality-shared features. Extensive experiments on multiple public datasets demonstrate the superiority of the IDKL network over state-of-the-art methods. Code is available at https://github.com/1KK077/IDKL.
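The last two steps of the pipeline (distilling purified modality-specific knowledge into the shared features, then aligning the two modalities) can be illustrated with a toy sketch. The abstract does not give the form of either loss, so the mean-squared-error distillation term and the center-distance alignment term below are stand-in assumptions, and all function names are hypothetical; the released code may differ substantially.

```python
from statistics import fmean


def mse(a, b):
    """Mean squared error between two equal-length feature vectors."""
    return fmean((x - y) ** 2 for x, y in zip(a, b))


def center(features):
    """Per-dimension mean of a batch of feature vectors."""
    return [fmean(column) for column in zip(*features)]


def distillation_loss(shared, purified_specific):
    """Assumed form: pull each shared feature toward its purified
    modality-specific counterpart with an MSE penalty."""
    return fmean(mse(s, p) for s, p in zip(shared, purified_specific))


def alignment_loss(shared_visible, shared_infrared):
    """Assumed form: squared distance between the visible and infrared
    centers of the (enhanced) shared features."""
    return mse(center(shared_visible), center(shared_infrared))


# Toy 2-D features for two samples per modality (made-up numbers).
shared_visible = [[1.0, 0.0], [3.0, 2.0]]
shared_infrared = [[0.0, 1.0], [2.0, 1.0]]
purified_visible = [[1.0, 1.0], [3.0, 2.0]]

total = (distillation_loss(shared_visible, purified_visible)
         + alignment_loss(shared_visible, shared_infrared))
print(total)
```

In a real training setup these terms would be computed on network outputs and combined with identity losses; the sketch only shows how a distillation term and an alignment term act on different pairs of features.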