In this work, we construct a large-scale dataset for Ground-to-Aerial Person Search, named G2APS, which contains 31,770 images with 260,559 annotated bounding boxes for 2,644 identities appearing in both UAV and ground surveillance cameras. To our knowledge, this is the first dataset for cross-platform intelligent surveillance applications, in which UAVs can serve as a powerful complement to ground surveillance cameras. To simulate real cross-platform ground-to-aerial surveillance scenarios more realistically, the surveillance cameras are fixed about 2 meters above the ground, while the UAVs capture videos of persons at different locations, with a variety of view angles, flight attitudes, and flight modes. As a result, the dataset has the following unique characteristics: 1) drastic view-angle changes between query and gallery person images from cross-platform cameras; 2) diverse resolutions, poses, and views of the person images across 9 real-world scenarios. Based on the G2APS benchmark, we present a detailed analysis of current two-step and end-to-end person search methods, and further propose a simple yet effective knowledge distillation scheme on the head of the ReID network, which achieves state-of-the-art performance on G2APS and on two previous public person search datasets, i.e., PRW and CUHK-SYSU. The dataset and source code are available at \url{https://github.com/yqc123456/HKD_for_person_search}.