The Person Re-Identification (Re-ID) task seeks to enhance the tracking of multiple individuals by surveillance cameras. It also supports multimodal tasks, including text-based person retrieval and human matching. One of the most prominent challenges in Re-ID is clothes-changing, where the same person may appear in different outfits. While previous methods have made notable progress in maintaining clothing data consistency and handling clothes-changing data, they still rely excessively on clothing information, which can limit performance because human appearance is dynamic. To mitigate this challenge, we propose Pose-Guided Supervision (PGS), an effective framework for learning pose guidance within the Re-ID task. PGS consists of three modules: a human encoder, a pose encoder, and a Pose-to-Human Projection (PHP) module. The pose encoder is a frozen pre-trained model, while the human encoder is a pre-trained human-centric model that we fine-tune. The PHP module transfers pose knowledge from the pose encoder to the human encoder through multiple projectors. In extensive experiments on five benchmark datasets, our framework consistently surpasses current state-of-the-art methods. Our code is available at https://github.com/huyquoctrinh/PGS.
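To make the three-module layout concrete, the following is a minimal toy sketch of it, not the released implementation. Everything in it is an assumption made for illustration: the class name `PoseGuidedSketch`, the use of single linear maps as stand-ins for both encoders and for each projector, the feature dimensions, and the mean-squared alignment loss between the human feature and each projected pose feature. The abstract does not specify the loss or the projector design, and the identity loss that a Re-ID model would also need is omitted.

```python
import copy
import random


def init_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def linear(weights, x):
    """Matrix-vector product: a stand-in for a real encoder or projector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


class PoseGuidedSketch:
    """Hypothetical toy layout: frozen pose encoder, trainable human encoder,
    and several projectors mapping pose features into the human feature space."""

    def __init__(self, image_dim=8, pose_dim=4, human_dim=6, num_projectors=2, seed=0):
        rng = random.Random(seed)
        self.pose_encoder = init_matrix(pose_dim, image_dim, rng)    # frozen
        self.human_encoder = init_matrix(human_dim, image_dim, rng)  # fine-tuned
        self.projectors = [init_matrix(human_dim, pose_dim, rng)
                           for _ in range(num_projectors)]

    def guidance_loss(self, image):
        """Assumed loss: mean squared gap between the human feature and
        each projected pose feature, averaged over projectors."""
        pose_feat = linear(self.pose_encoder, image)
        human_feat = linear(self.human_encoder, image)
        total = 0.0
        for proj in self.projectors:
            target = linear(proj, pose_feat)
            total += sum((h - t) ** 2 for h, t in zip(human_feat, target))
        return total / (len(human_feat) * len(self.projectors))

    def train_step(self, image, lr=0.05):
        """One gradient-descent step on the human encoder and the projectors.
        The pose encoder is never written to, which is what 'frozen' means here."""
        pose_feat = linear(self.pose_encoder, image)
        human_feat = linear(self.human_encoder, image)
        scale = 2.0 / (len(human_feat) * len(self.projectors))
        grad_human = [0.0] * len(human_feat)
        for proj in self.projectors:
            target = linear(proj, pose_feat)
            for i, (h, t) in enumerate(zip(human_feat, target)):
                g = scale * (h - t)
                grad_human[i] += g
                for j, p in enumerate(pose_feat):
                    # d(loss)/d(proj[i][j]) = -g * p, so descent adds lr * g * p
                    proj[i][j] += lr * g * p
        for i, g in enumerate(grad_human):
            for j, x in enumerate(image):
                # d(loss)/d(human_encoder[i][j]) = grad_human[i] * x
                self.human_encoder[i][j] -= lr * g * x


# Usage: a random vector stands in for an input image.
rng = random.Random(1)
image = [rng.random() for _ in range(8)]
model = PoseGuidedSketch()
frozen_before = copy.deepcopy(model.pose_encoder)
loss_before = model.guidance_loss(image)
for _ in range(50):
    model.train_step(image)
loss_after = model.guidance_loss(image)
print("guidance loss before/after:", loss_before, loss_after)
```

The point of the sketch is the gradient flow: only the human encoder and the projectors receive updates, so pose information reaches the human encoder through the alignment term while the pose encoder stays fixed.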