The key to visible-infrared person re-identification (VIReID) is minimizing the modality discrepancy between visible and infrared images. Existing methods mainly exploit spatial information and ignore discriminative frequency information. To address this issue, this paper reduces the modality discrepancy from a frequency-domain perspective. Specifically, we propose a novel Frequency Domain Nuances Mining (FDNM) method to explore cross-modality frequency-domain information; it consists mainly of an amplitude guided phase (AGP) module and an amplitude nuances mining (ANM) module. The two modules are mutually beneficial and jointly explore visible-infrared nuances in the frequency domain, thereby effectively reducing the modality discrepancy there. In addition, we propose a center-guided nuances mining loss that encourages the ANM module to preserve discriminative identity information while discovering diverse cross-modality nuances. Extensive experiments show that FDNM significantly improves VIReID performance. In particular, our method outperforms the second-best method by 5.2% in Rank-1 accuracy and 5.8% in mAP on the SYSU-MM01 dataset under the indoor search mode. We also validate the effectiveness and generalization of our method on the challenging visible-infrared face recognition task. The code will be made available.
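For readers unfamiliar with the amplitude and phase components that the AGP and ANM module names refer to, the following is a minimal sketch of how a 2-D image can be split into those two spectra with a discrete Fourier transform and then recombined. It is not the FDNM implementation: the function names `split_amplitude_phase` and `merge_amplitude_phase` are hypothetical, random arrays stand in for real visible and infrared images, and the amplitude swap at the end is only a generic illustration of how the two components can be handled separately.

```python
import numpy as np


def split_amplitude_phase(image):
    """Return the amplitude and phase spectra of a 2-D array (hypothetical helper)."""
    spectrum = np.fft.fft2(image)
    return np.abs(spectrum), np.angle(spectrum)


def merge_amplitude_phase(amplitude, phase):
    """Rebuild a spatial-domain array from amplitude and phase spectra (hypothetical helper)."""
    spectrum = amplitude * np.exp(1j * phase)
    # The input images are real, so the imaginary part is only rounding noise.
    return np.real(np.fft.ifft2(spectrum))


# Random stand-ins for single-channel images (assumption: real data would be loaded here).
rng = np.random.default_rng(0)
visible = rng.random((8, 8))
infrared = rng.random((8, 8))

vis_amp, vis_pha = split_amplitude_phase(visible)
ir_amp, ir_pha = split_amplitude_phase(infrared)

# Round trip: an image's own amplitude and phase recover the image.
reconstructed = merge_amplitude_phase(vis_amp, vis_pha)

# Generic illustration only: infrared amplitude combined with visible phase.
hybrid = merge_amplitude_phase(ir_amp, vis_pha)
```

The round trip shows that the two spectra together carry all of the image information, which is why a method can process amplitude and phase in separate branches without losing content.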