The paper introduces the Decouple Re-identificatiOn and human Parsing (DROP) method for occluded person re-identification (ReID). Mainstream approaches either use global features to learn ReID and human parsing simultaneously as a multi-task problem, or rely on semantic information for attention guidance. DROP argues that the inferior performance of the former stems from the different feature granularity that the two tasks require: ReID focuses on instance-level differences between pedestrian body parts, while human parsing centers on semantic spatial context that reflects the internal structure of the human body. To address this, DROP decouples the features used for ReID and human parsing and proposes detail-preserving upsampling to combine feature maps of varying resolution. Parsing-specific features are decoupled for the human parsing branch, and human position information is added exclusively to that branch. In the ReID branch, a part-aware compactness loss is introduced to enhance instance-level part differences. Experimental results demonstrate the efficacy of DROP: it achieves a Rank-1 accuracy of 76.8% on Occluded-Duke, surpassing the two mainstream approaches. The code is available at https://github.com/shuguang-52/DROP.
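The abstract names a part-aware compactness loss but does not define it. As a rough illustration only, the sketch below assumes such a loss penalizes the spread of features that belong to the same body part of the same identity. This reading, the function name `part_compactness_loss`, and the toy data are all assumptions made for illustration, not the formulation used in DROP.

```python
from collections import defaultdict


def part_compactness_loss(features, identities, parts):
    """Mean squared distance of each feature to the centre of its (identity, part) group.

    Hypothetical illustration of a compactness-style loss; not the loss defined in the paper.
    """
    # Group feature vectors by the identity and body part they belong to.
    groups = defaultdict(list)
    for feat, pid, part in zip(features, identities, parts):
        groups[(pid, part)].append(feat)

    total, count = 0.0, 0
    for feats in groups.values():
        dim = len(feats[0])
        # Centre of the group: per-dimension mean of its features.
        centre = [sum(f[d] for f in feats) / len(feats) for d in range(dim)]
        for f in feats:
            total += sum((f[d] - centre[d]) ** 2 for d in range(dim))
            count += 1
    return total / count if count else 0.0


# Toy data (made up): one identity, two "head" features and two "torso" features.
features = [[0.0, 0.0], [2.0, 0.0], [5.0, 5.0], [5.0, 5.0]]
identities = [1, 1, 1, 1]
parts = ["head", "head", "torso", "torso"]
loss = part_compactness_loss(features, identities, parts)
```

Under this assumed definition, the loss is zero when all features of a given part and identity coincide, and it grows as they scatter. A real implementation would operate on batched tensors inside the ReID branch and would likely be combined with the usual identity losses.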