As an important task in intelligent transportation systems, Aerial-Ground person Re-IDentification (AG-ReID) aims to retrieve specific persons across heterogeneous cameras with different viewpoints. Previous methods typically adopt deep learning-based models and focus on extracting view-invariant features. However, they usually overlook the semantic information carried by person attributes. In addition, existing training strategies often rely on fully fine-tuning large-scale models, which significantly increases training costs. To address these issues, we propose a novel framework named LATex for AG-ReID, which adopts prompt-tuning strategies to leverage attribute-based text knowledge. Specifically, building on the Contrastive Language-Image Pre-training (CLIP) model, we first propose an Attribute-aware Image Encoder (AIE) to extract both global semantic features and attribute-aware features from input images. Then, with these features, we propose a Prompted Attribute Classifier Group (PACG) to predict person attributes and obtain attribute representations. Finally, we design a Coupled Prompt Template (CPT) to transform the attribute representations and view information into structured sentences, which are processed by the text encoder of CLIP to generate more discriminative features. As a result, our framework can fully leverage attribute-based text knowledge to improve AG-ReID performance. Extensive experiments on three AG-ReID benchmarks demonstrate the effectiveness of our proposed method. The source code is available at https://github.com/kevinhu314/LATex.
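For readers unfamiliar with prompt-based text features, the sketch below illustrates the general idea behind the CPT step: attribute predictions and view information are composed into a structured sentence and encoded by CLIP's text encoder. It is a minimal illustration assuming OpenAI's `clip` package, not the authors' implementation; the function `build_prompt`, the template wording, and the example attributes are hypothetical.

```python
# A minimal sketch (NOT the LATex implementation) of turning predicted
# attributes and view information into a structured sentence, then encoding
# it with CLIP's text encoder. Assumes PyTorch and OpenAI's `clip` package.
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/16", device=device)

def build_prompt(view: str, attributes: list[str]) -> str:
    """Couple view information with attribute phrases in one sentence.
    Hypothetical template; the paper's actual CPT design may differ."""
    return (f"A photo of a person captured from the {view} viewpoint, "
            f"with {', '.join(attributes)}.")

# Hypothetical attribute predictions for one image (e.g., from a classifier).
attrs = ["short hair", "a red top", "a backpack"]
prompt = build_prompt("aerial", attrs)

tokens = clip.tokenize([prompt]).to(device)
with torch.no_grad():
    text_feat = model.encode_text(tokens)  # shape (1, 512) for ViT-B/16

print(prompt)
print(text_feat.shape)
```

In the paper, the template is filled with learned attribute representations rather than literal words, but the structure is analogous: a single sentence couples view and attribute information so that the resulting text feature is discriminative for retrieval.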