TOP-ReID: Multi-spectral Object Re-Identification with Token Permutation

Multi-spectral object re-identification (ReID) aims to retrieve specific objects by leveraging complementary information from different image spectra. It offers clear advantages over traditional single-spectral ReID in complex visual environments. However, the significant distribution gap among image spectra makes it difficult to learn effective multi-spectral feature representations. In addition, most current Transformer-based ReID methods use only the global feature of the class token for holistic retrieval and ignore discriminative local features. To address these issues, we go a step further and utilize all the tokens of Transformers, proposing a cyclic token permutation framework for multi-spectral object ReID, dubbed TOP-ReID. More specifically, we first deploy a multi-stream deep network based on vision Transformers to preserve the distinct information of each image spectrum. Then, we propose a Token Permutation Module (TPM) for cyclic multi-spectral feature aggregation. It not only facilitates spatial feature alignment across image spectra, but also allows the class token of each spectrum to perceive the local details of the other spectra. Meanwhile, we propose a Complementary Reconstruction Module (CRM), which introduces dense token-level reconstruction constraints to reduce the distribution gap across image spectra. With these modules, the proposed framework generates more discriminative multi-spectral features for robust object ReID. Extensive experiments on three ReID benchmarks (i.e., RGBNT201, RGBNT100 and MSVR310) verify the effectiveness of our method. The code is available at https://github.com/924973292/TOP-ReID.
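
As a rough illustration of the cyclic aggregation idea behind the TPM, the sketch below lets the class token of each spectrum attend to the patch tokens of the next spectrum in a fixed cycle. It is a simplified reading of the description above, not the released implementation: the function names (cyclic_pairs, attend, permute_tokens), the spectrum order, the single-head attention, and the absence of learned projections are all assumptions made for illustration.

```python
import math


def cyclic_pairs(names):
    """Pair each spectrum with the next one in the list, wrapping around."""
    return [(names[i], names[(i + 1) % len(names)]) for i in range(len(names))]


def attend(query, keys):
    """Single-head scaled dot-product attention of one query over `keys`.

    Assumption: the keys double as values and there are no learned
    projections, which keeps the sketch dependency-free.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [
        sum(w * key[d] for w, key in zip(weights, keys))
        for d in range(len(query))
    ]


def permute_tokens(cls_tokens, patch_tokens):
    """For every spectrum, aggregate the next spectrum's patch tokens
    using the current spectrum's class token as the query.

    cls_tokens:   dict mapping spectrum name -> class token (list of floats)
    patch_tokens: dict mapping spectrum name -> list of patch tokens
    """
    names = list(cls_tokens)
    return {
        src: attend(cls_tokens[src], patch_tokens[dst])
        for src, dst in cyclic_pairs(names)
    }


if __name__ == "__main__":
    # Hypothetical toy tokens for three spectra (values chosen arbitrarily).
    cls_tokens = {"RGB": [1.0, 0.0], "NIR": [0.0, 1.0], "TIR": [1.0, 1.0]}
    patch_tokens = {
        "RGB": [[0.5, 0.1], [0.2, 0.9]],
        "NIR": [[0.3, 0.3], [0.8, 0.2]],
        "TIR": [[0.1, 0.7], [0.6, 0.4]],
    }
    for name, feature in permute_tokens(cls_tokens, patch_tokens).items():
        print(name, feature)
```

In this sketch every class token sees exactly one other spectrum per pass. A full model would presumably use learned query, key, and value projections and could repeat the cycle so that each class token eventually sees every other spectrum.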
