Visible-Infrared Person Re-identification (VI-ReID) is a challenging cross-modal pedestrian retrieval task because of significant intra-class variations and cross-modal discrepancies among different cameras. Existing works mainly focus on embedding images of different modalities into a unified space to mine modality-shared features. They seek distinctive information only within these shared features and ignore the identity-aware information that is implicit in the modality-specific features. To address this issue, we propose a novel Implicit Discriminative Knowledge Learning (IDKL) network to uncover and leverage the implicit discriminative information contained within the modality-specific features. First, we extract modality-specific and modality-shared features using a novel dual-stream network. Then, the modality-specific features undergo purification to reduce their modality style discrepancies while preserving identity-aware discriminative knowledge. Subsequently, this implicit knowledge is distilled into the modality-shared features to enhance their distinctiveness. Finally, an alignment loss is proposed to minimize the modality discrepancy of the enhanced modality-shared features. Extensive experiments on multiple public datasets demonstrate the superiority of the IDKL network over state-of-the-art methods. Code is available at https://github.com/1KK077/IDKL.
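To make the distillation and alignment steps concrete, the following is a minimal sketch only, not the authors' implementation. The abstract does not specify the form of either loss, so the sketch assumes a temperature-scaled KL divergence for distillation (purified modality-specific predictions act as the teacher, modality-shared predictions as the student) and a squared distance between modality mean features for alignment. All function names and parameters here are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Assumed stand-in for the distillation step: pull the shared-feature
    (student) prediction toward the specific-feature (teacher) prediction."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return kl_divergence(teacher, student)


def mean_vector(features):
    """Column-wise mean of a list of equal-length feature vectors."""
    count = len(features)
    return [sum(column) / count for column in zip(*features)]


def alignment_loss(visible_feats, infrared_feats):
    """Assumed stand-in for the alignment step: squared Euclidean distance
    between the mean visible and mean infrared shared features."""
    mu_visible = mean_vector(visible_feats)
    mu_infrared = mean_vector(infrared_feats)
    return sum((a - b) ** 2 for a, b in zip(mu_visible, mu_infrared))


if __name__ == "__main__":
    # Toy 2-D features for two samples per modality (made-up numbers).
    visible = [[1.0, 0.0], [3.0, 2.0]]
    infrared = [[2.0, 2.0], [2.0, 2.0]]
    print("alignment:", alignment_loss(visible, infrared))
    print("distillation:", distillation_loss([2.0, 0.0], [0.0, 2.0]))
```

Under these assumptions, both terms vanish when the two inputs agree (identical logits, or identical modality means) and grow as they diverge, which is the behavior the abstract describes for the enhanced modality-shared features.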