Unsupervised visible-infrared person re-identification (USL-VI-ReID) aims to match pedestrian images of the same identity across modalities without annotations. Existing works mainly alleviate the modality gap by aligning instance-level features of the unlabeled samples. However, the relationships between cross-modality clusters are not well explored. To this end, we propose a novel bilateral cluster matching-based learning framework that reduces the modality gap by matching cross-modality clusters. Specifically, we design a Many-to-many Bilateral Cross-Modality Cluster Matching (MBCCM) algorithm by optimizing the maximum matching problem in a bipartite graph. The matched cluster pairs then share visible and infrared pseudo-labels during model training. Under this supervisory signal, a Modality-Specific and Modality-Agnostic (MSMA) contrastive learning framework is proposed to jointly align features at the cluster level. Meanwhile, a cross-modality Consistency Constraint (CC) is proposed to explicitly reduce the large modality discrepancy. Extensive experiments on the public SYSU-MM01 and RegDB datasets demonstrate the effectiveness of the proposed method, which surpasses state-of-the-art approaches by a large margin of 8.76% mAP on average.
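To make the cluster-matching idea concrete, the following is a minimal sketch, not the MBCCM algorithm itself, whose formulation is not given in this abstract. It assumes each cluster is summarized by a non-zero centroid vector, connects a visible and an infrared cluster when their cosine similarity reaches a hypothetical threshold, and solves a plain one-to-one maximum bipartite matching with Kuhn's augmenting-path algorithm. All function names, the threshold value, and the one-to-one simplification are illustrative assumptions.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def build_candidate_edges(vis_centroids, ir_centroids, threshold):
    """For each visible cluster, list the infrared clusters that are similar enough."""
    return [
        [j for j, ir in enumerate(ir_centroids)
         if cosine_similarity(v, ir) >= threshold]
        for v in vis_centroids
    ]


def maximum_bipartite_matching(edges, num_ir):
    """Kuhn's augmenting-path algorithm.

    edges[v] lists the infrared clusters that visible cluster v may match.
    Returns a list mapping each infrared index to a visible index or None.
    """
    match_ir = [None] * num_ir

    def try_augment(v, visited):
        for j in edges[v]:
            if j in visited:
                continue
            visited.add(j)
            # Take j if it is free, or if its current partner can be re-routed.
            if match_ir[j] is None or try_augment(match_ir[j], visited):
                match_ir[j] = v
                return True
        return False

    for v in range(len(edges)):
        try_augment(v, set())
    return match_ir


def shared_pseudo_labels(vis_centroids, ir_centroids, threshold=0.8):
    """Matched visible/infrared clusters share a label; unmatched ones keep their own.

    The threshold of 0.8 is an arbitrary illustrative value (assumption).
    """
    edges = build_candidate_edges(vis_centroids, ir_centroids, threshold)
    match_ir = maximum_bipartite_matching(edges, len(ir_centroids))
    vis_labels = list(range(len(vis_centroids)))
    ir_labels = []
    next_label = len(vis_centroids)
    for v in match_ir:
        if v is None:
            ir_labels.append(next_label)  # unmatched infrared cluster: fresh label
            next_label += 1
        else:
            ir_labels.append(vis_labels[v])  # matched: reuse the visible label
    return vis_labels, ir_labels


if __name__ == "__main__":
    vis = [[1.0, 0.0], [0.0, 1.0]]
    ir = [[0.1, 0.9], [0.9, 0.1], [-1.0, -1.0]]
    print(shared_pseudo_labels(vis, ir))
```

In this toy setup the first two infrared clusters inherit the labels of their matched visible clusters, while the third, which resembles neither, keeps a label of its own. A many-to-many variant, as the MBCCM name indicates, would additionally let one cluster be paired with several clusters of the other modality, which this simplified sketch does not attempt.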