With the spread of video surveillance, multiple cameras are now used to accurately locate pedestrians in a specific area. However, previous methods rely on human-labeled annotations in every video frame and camera view, which imposes a heavier burden than the camera calibration and synchronization that are necessary in any case. We therefore propose an Unsupervised Multi-view Pedestrian Detection approach (UMPD), which learns a multi-view pedestrian detector via 2D-3D mapping without any annotations. 1) First, Semantic-aware Iterative Segmentation (SIS) extracts unsupervised representations of the multi-view images and converts them into 2D pedestrian masks that serve as pseudo labels, using our proposed iterative PCA and zero-shot semantic classes from vision-language models. 2) Second, the Geometry-aware Volume-based Detector (GVD) encodes multi-view 2D images end-to-end into a 3D volume via 2D-to-3D geometric projection and predicts voxel-wise density and color; it is trained with 3D-to-2D rendering losses against the SIS pseudo labels. 3) Third, to improve the detection results, i.e., the 3D density from GVD projected onto the Bird's-Eye-View (BEV), we propose Vertical-aware BEV Regularization (VBR), which constrains the predicted density to be vertical, like natural pedestrian poses. Extensive experiments on the popular multi-view pedestrian detection benchmarks Wildtrack, Terrace, and MultiviewX show that UMPD, to the best of our knowledge the first fully unsupervised method for this task, performs competitively with previous state-of-the-art supervised techniques. Code will be available.
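As a rough illustration of the two geometric operations the abstract names (2D-to-3D projection and projecting 3D density onto the BEV), the following NumPy sketch may help. It is not the authors' implementation: the function names `project_voxels` and `density_to_bev`, the 3x4 pinhole projection matrix `P`, and the choice of a max over the vertical axis are all assumptions made for illustration.

```python
import numpy as np


def project_voxels(points_xyz, P):
    """Project N x 3 world points to pixel coordinates.

    P is an assumed 3 x 4 pinhole projection matrix (intrinsics @ extrinsics),
    taken from camera calibration. Points at or behind the camera plane are
    flagged invalid and returned as NaN.
    """
    homo = np.hstack([points_xyz, np.ones((len(points_xyz), 1))])  # N x 4
    uvw = homo @ P.T                                               # N x 3
    depth = uvw[:, 2]
    valid = depth > 1e-6
    uv = np.full((len(points_xyz), 2), np.nan)
    uv[valid] = uvw[valid, :2] / depth[valid, None]
    return uv, valid


def density_to_bev(density):
    """Collapse an (X, Y, Z) density volume to an (X, Y) BEV map.

    Taking the max over the vertical axis Z is an assumption; a sum or a
    rendering-style accumulation would be equally plausible choices.
    """
    return density.max(axis=2)
```

In such a setup, the pixel coordinates returned by `project_voxels` would be used to sample per-view image features for each voxel, and the BEV map from `density_to_bev` would be thresholded or peak-detected to obtain pedestrian locations on the ground plane.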