End-to-end person search aims to jointly detect and re-identify a target person in raw scene images with a unified model. The detection task unifies all persons while the re-id task discriminates between different identities, resulting in conflicting optimization objectives. Existing works propose to decouple end-to-end person search to alleviate this conflict. Yet these methods remain sub-optimal on one or both of the sub-tasks because their models are only partially decoupled, which limits the overall person search performance. In this paper, we propose to fully decouple person search in order to reach optimal person search performance. A task-incremental person search network is proposed to incrementally construct an end-to-end model for the detection and re-id sub-tasks, which decouples the model architecture for the two sub-tasks. The proposed task-incremental network allows task-incremental training for the two conflicting tasks. This enables independent learning for the different objectives and thus fully decouples the model for person search. Comprehensive experimental evaluations demonstrate the effectiveness of the proposed fully decoupled models for end-to-end person search.
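The idea of task-incremental training can be illustrated with a minimal toy sketch. The abstract does not state how the two sub-tasks are kept independent, so the sketch assumes one plausible mechanism: the detection branch is trained first and then frozen before the re-id branch is added and trained. The names `SubNetwork`, `train`, and `task_incremental_training` are hypothetical, and each branch is reduced to a single scalar weight with a quadratic loss purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class SubNetwork:
    """Toy stand-in for one branch: a single scalar weight (assumption)."""
    name: str
    weight: float = 0.0
    frozen: bool = False

    def step(self, grad: float, lr: float) -> None:
        # A frozen branch ignores every gradient update.
        if not self.frozen:
            self.weight -= lr * grad


def train(net: SubNetwork, target: float, steps: int = 200, lr: float = 0.1) -> None:
    """Minimise the toy loss (weight - target)^2 by plain gradient descent."""
    for _ in range(steps):
        grad = 2.0 * (net.weight - target)
        net.step(grad, lr)


def task_incremental_training(det_target: float, reid_target: float):
    # Stage 1: construct and train the detection branch on its own objective.
    detector = SubNetwork("detection")
    train(detector, det_target)

    # Stage 2 (assumed mechanism): freeze detection, then add and train re-id.
    detector.frozen = True
    reid = SubNetwork("reid")
    train(reid, reid_target)

    # Any gradient that reaches the frozen detector leaves it unchanged,
    # so the re-id objective cannot disturb the detection solution.
    before = detector.weight
    detector.step(grad=5.0, lr=0.1)
    assert detector.weight == before
    return detector, reid


if __name__ == "__main__":
    det, reid = task_incremental_training(det_target=1.0, reid_target=-2.0)
    print(det, reid)
```

In this sketch the two objectives never share a trainable parameter at the same time, which is the sense in which the conflicting tasks are learned independently; a real model would replace the scalar weights with detection and re-id sub-networks.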