Real-world object re-identification (ReID) systems often face modality inconsistency, where query and gallery images come from different sensors (e.g., RGB, NIR, TIR). However, most existing methods assume modality-matched conditions, which limits their robustness and scalability in practical applications. To address this challenge, we propose MDReID, a flexible any-to-any image-level ReID framework designed to operate under both modality-matched and modality-mismatched scenarios. MDReID builds on the insight that modality information can be decomposed into two components: modality-shared features, which are predictable and transferable, and modality-specific features, which capture unique, modality-dependent characteristics. To leverage this decomposition effectively, MDReID introduces two key components: Modality Decoupling Learning (MDL) and Modality-aware Metric Learning (MML). Specifically, MDL explicitly decomposes modality features into modality-shared and modality-specific representations, enabling effective retrieval in both modality-matched and modality-mismatched scenarios. MML, a tailored metric learning strategy, further enforces orthogonality and complementarity between the two components to enhance discriminative power across modalities. Extensive experiments on three challenging multi-modality ReID benchmarks (RGBNT201, RGBNT100, and MSVR310) consistently demonstrate the superiority of MDReID. Notably, MDReID achieves significant mAP improvements of 9.8\%, 3.0\%, and 11.5\% in general modality-matched scenarios, and average gains of 3.4\%, 11.8\%, and 10.9\% in modality-mismatched scenarios, respectively. The code is available at: \textcolor{magenta}{https://github.com/stone96123/MDReID}.
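To make the decoupling idea concrete, the following is a minimal illustrative sketch, not the paper's actual MDL/MML implementation (which lives in the linked repository): it assumes hypothetical linear projections `W_shared` and `W_specific` that split a backbone feature into modality-shared and modality-specific parts, and an orthogonality penalty (squared cosine similarity) of the kind an MML-style loss could minimize so the two components stay complementary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical projection matrices (illustrative placeholders; the paper's
# actual MDL layers are learned and their architecture is not shown here).
D, K = 8, 4
W_shared = rng.standard_normal((D, K))
W_specific = rng.standard_normal((D, K))

def decompose(x):
    """Split a backbone feature x into modality-shared and
    modality-specific components via the two projections."""
    return x @ W_shared, x @ W_specific

def orthogonality_penalty(f_sh, f_sp, eps=1e-8):
    """Squared cosine similarity between the two components.
    Driving this toward zero encourages the shared and specific
    parts to carry non-redundant (complementary) information."""
    cos = f_sh @ f_sp / (np.linalg.norm(f_sh) * np.linalg.norm(f_sp) + eps)
    return cos ** 2

x = rng.standard_normal(D)
f_sh, f_sp = decompose(x)
loss = orthogonality_penalty(f_sh, f_sp)  # scalar in [0, 1]
```

At retrieval time, a framework of this shape could match on the shared component alone when query and gallery modalities differ, and combine both components when they match.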