Multimodal person re-identification (Re-ID) aims to match pedestrian images across different modalities. However, most existing methods focus on limited cross-modal settings and fail to support arbitrary query-retrieval combinations, hindering practical deployment. We propose FlexiReID, a flexible framework that supports seven retrieval modes across four modalities: RGB, infrared, sketch, and text. FlexiReID introduces an adaptive mixture-of-experts (MoE) mechanism to dynamically integrate diverse modality features, and a cross-modal query fusion module to enhance multimodal feature extraction. To facilitate comprehensive evaluation, we construct CIRS-PEDES, a unified dataset extending four popular Re-ID datasets to cover all four modalities. Extensive experiments demonstrate that FlexiReID achieves state-of-the-art performance and generalizes well to complex scenarios.
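As a rough illustration of how an adaptive MoE can mix features from heterogeneous modalities, the sketch below implements a softly gated set of expert MLPs in PyTorch. All names (`ModalityMoE`, `num_experts`, the single-gate design) are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal sketch of an adaptive mixture-of-experts layer for fusing
# modality features. The design (one softmax gate over dense experts)
# is an assumption for illustration, not FlexiReID's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityMoE(nn.Module):
    """Softly routes a feature vector to expert MLPs via a learned gate."""
    def __init__(self, dim: int = 512, num_experts: int = 4):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)  # per-expert mixing weights
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim) feature from any modality (RGB, IR, sketch, or text)
        weights = F.softmax(self.gate(x), dim=-1)                       # (batch, E)
        expert_out = torch.stack([e(x) for e in self.experts], dim=1)  # (batch, E, dim)
        return (weights.unsqueeze(-1) * expert_out).sum(dim=1)        # weighted mix

if __name__ == "__main__":
    moe = ModalityMoE(dim=512, num_experts=4)
    feats = torch.randn(8, 512)
    print(moe(feats).shape)  # torch.Size([8, 512])
```

The gate lets the network weight experts differently per input, so features from, say, a sketch query can be routed differently than RGB features while sharing one fusion layer.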