Visible-infrared person re-identification (VI-ReID) aims to retrieve the same pedestrian of interest across the visible and infrared modalities. Existing models mainly focus on compensating for modality-specific information to reduce modality variation. However, these methods often incur higher computational overhead and may introduce interfering information when generating the corresponding images or features. To address this issue, it is critical to leverage pedestrian-attentive features and to learn modality-complete and modality-consistent representations. In this paper, we propose a novel Transferring Modality-Aware Pedestrian Attentive Learning (TMPA) model that focuses on pedestrian regions to efficiently compensate for missing modality-specific features. Specifically, we propose PedMix, a region-based data augmentation module that enhances pedestrian region coherence by mixing the corresponding regions from different modalities. A lightweight hybrid compensation module, the Modality Feature Transfer (MFT), integrates cross-attention and convolutional networks to fully explore discriminative modality-complete features with minimal computational overhead. Extensive experiments on the benchmark SYSU-MM01 and RegDB datasets demonstrate the effectiveness of the proposed TMPA model.
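The idea of mixing corresponding regions from the two modalities can be illustrated with a small sketch. The abstract does not specify how PedMix selects or blends regions, so everything below is an assumption made for illustration: the function name `mix_region`, the rectangular `box`, and the blending weight `lam` are hypothetical, and the code shows a generic region-mixing augmentation on aligned grayscale images rather than the authors' implementation.

```python
def mix_region(visible, infrared, box, lam=0.5):
    """Blend a rectangular region of two aligned grayscale images.

    visible, infrared: 2-D lists of floats with identical shape.
    box: (top, left, bottom, right), half-open; a hypothetical pedestrian region.
    lam: weight given to the visible pixel inside the box (assumed parameter).
    """
    height, width = len(visible), len(visible[0])
    if len(infrared) != height or any(len(row) != width for row in infrared):
        raise ValueError("images must have the same shape")
    top, left, bottom, right = box
    if not (0 <= top <= bottom <= height and 0 <= left <= right <= width):
        raise ValueError("box must lie inside the image")

    # Copy the visible image so the input is left unchanged;
    # pixels outside the box keep their visible-modality values.
    mixed = [row[:] for row in visible]
    for y in range(top, bottom):
        for x in range(left, right):
            mixed[y][x] = lam * visible[y][x] + (1.0 - lam) * infrared[y][x]
    return mixed


# Toy 4x4 images: visible is all ones, infrared is all zeros.
visible = [[1.0] * 4 for _ in range(4)]
infrared = [[0.0] * 4 for _ in range(4)]
mixed = mix_region(visible, infrared, box=(1, 1, 3, 3), lam=0.25)
```

In this toy example only the central 2x2 block is blended, while the border keeps its visible-modality values. A real pipeline would presumably operate on image tensors and obtain the pedestrian region from the data, which this sketch does not attempt.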