Location determination has wide applications in daily life. Unlike existing efforts devoted to localizing tourist photos captured by perspective cameras, in this article we focus on devising person positioning solutions using overhead fisheye cameras. Such solutions offer a large field of view (FOV), low cost, robustness to occlusion, and a non-intrusive working mode (persons do not need to carry cameras). However, related studies are scarce, owing to the paucity of data. To stimulate research in this area, we present LOAF, the first large-scale overhead fisheye dataset for person detection and localization. LOAF is built with several essential features: i) the data cover abundant diversity in scene, human pose, density, and location; ii) it currently contains the largest number of annotated pedestrians, i.e., 457K bounding boxes with ground-truth location information; iii) the body boxes are labeled as radius-aligned so as to fully address the positioning challenge. To approach localization, we build a fisheye person detection network, which exploits the fisheye distortions through a rotation-equivariant training strategy and predicts radius-aligned human boxes end-to-end. Then, the actual locations of the detected persons are calculated by a numerical solution based on the fisheye model and camera altitude data. Extensive experiments on LOAF validate the superiority of our fisheye detector w.r.t. previous methods, and show that our whole fisheye positioning solution is able to locate all persons in the FOV with an accuracy of 0.5 m, within 0.1 s.
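As a rough illustration of how a location can be recovered from a fisheye model and the camera altitude, the sketch below maps one image point to ground-plane offsets. It is not the numerical solution used in this work: it assumes an ideal equidistant fisheye model (r = f·θ), a camera pointing straight down, and a point lying on the ground plane. The function name `pixel_to_ground` and its parameters are hypothetical.

```python
import math


def pixel_to_ground(u, v, cx, cy, focal_px, cam_height_m):
    """Map an image point (u, v) to ground offsets (x, y) in metres.

    Assumptions (illustrative only, not the paper's model):
    - ideal equidistant fisheye projection, r = focal_px * theta;
    - optical axis perpendicular to the ground, centre at (cx, cy);
    - the point lies on the ground plane (e.g. a person's foot point).
    Offsets are expressed along the image u and v axes.
    """
    dx, dy = u - cx, v - cy
    r = math.hypot(dx, dy)              # radial distance from centre, pixels
    if r == 0.0:
        return 0.0, 0.0                 # directly below the camera
    theta = r / focal_px                # angle from the optical axis, radians
    if theta >= math.pi / 2:
        raise ValueError("point lies at or beyond the horizon")
    d = cam_height_m * math.tan(theta)  # horizontal distance on the ground
    return d * dx / r, d * dy / r       # keep the image-plane direction
```

For a real lens, the equidistant mapping would typically be replaced by a calibrated distortion model, in which case the angle θ is usually obtained by numerically inverting the radius-to-angle function rather than by a closed-form division.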