Multimodal LLM-Empowered Re-Ranking for Generalizable Person Re-Identification

Domain Generalizable (DG) person re-identification (Re-ID) has attracted growing research interest due to its potential for deployment in unseen real-world scenarios. Most existing approaches address DG Re-ID by training domain-generalizable encoders but overlook possible refinements at the inference stage. In contrast, this work explores an alternative direction: improving re-ranking at inference to enhance DG Re-ID. Conventional re-ranking methods typically rely on neighborhood-based distances to refine the initial ranking list, and therefore depend on the features produced by the Re-ID encoder. However, they deteriorate on target domains because the encoder lacks sufficient generalizability to produce reliable feature distances in unseen scenarios. Inspired by the remarkable generalization capabilities of recent Multimodal Large Language Models (MLLMs), we propose an MLLM-empowered distance metric to improve re-ranking in DG Re-ID. Specifically, we first adapt an MLLM to Re-ID data through supervised fine-tuning, which incorporates a domain-agnostic prompt and a query-candidate hard mining scheme. The adapted MLLM is then employed to compute a $μ$-distance during inference, which is robust to the domain gap and significantly enhances subsequent re-ranking performance. Our approach is model-agnostic and can be seamlessly integrated into previous re-ranking frameworks. Extensive experiments demonstrate that our approach consistently yields substantial performance improvements across multiple DG Re-ID benchmarks. The code of this work will be released at https://github.com/RikoLi/MUSE soon.
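
To make the idea of re-ranking with a second, pairwise distance concrete, the following is a minimal sketch and not the authors' method. The abstract does not say how the $μ$-distance is computed or how it is combined with encoder distances, so everything here is an assumption made for illustration: the function name `rerank_top_k`, the `pair_distance` callable (a stand-in for whatever score an adapted MLLM would return for a query-candidate pair), the restriction to the top-k candidates, and the linear fusion controlled by `weight`.

```python
from typing import Callable, List, Sequence


def rerank_top_k(
    initial_dist: Sequence[float],
    pair_distance: Callable[[int], float],
    k: int = 3,
    weight: float = 0.5,
) -> List[int]:
    """Re-order gallery indices for one query.

    initial_dist[i] is the encoder feature distance to candidate i.
    pair_distance(i) is a hypothetical second distance for candidate i
    (here a placeholder for an MLLM-derived score; an assumption).
    Only the k nearest candidates are re-scored; the rest keep their order.
    """
    # Initial ranking: ascending encoder distance.
    order = sorted(range(len(initial_dist)), key=lambda i: initial_dist[i])
    head, tail = order[:k], order[k:]

    # Assumed fusion rule: convex combination of the two distances.
    fused = {
        i: (1.0 - weight) * initial_dist[i] + weight * pair_distance(i)
        for i in head
    }
    head = sorted(head, key=lambda i: fused[i])
    return head + tail


if __name__ == "__main__":
    # Toy numbers, invented for the example.
    encoder_dist = [0.20, 0.10, 0.30, 0.90]
    stub_scores = {0: 0.1, 1: 0.9, 2: 0.2, 3: 0.0}
    print(rerank_top_k(encoder_dist, lambda i: stub_scores[i], k=3, weight=0.5))
```

In this toy setting, candidate 1 is nearest under the encoder distance but is demoted once the second distance is fused in, while candidate 3 stays last because it falls outside the re-scored top-k. Restricting the expensive pairwise scoring to a short list is a common design choice when the second scorer is costly; whether the paper does so is not stated in the abstract.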

