Crowd counting is a challenging task due to heavy occlusion and large variations in scale and density. Existing methods handle these challenges effectively but ignore low-resolution (LR) circumstances. LR input severely degrades counting performance for two crucial reasons: 1) detail information is limited; 2) overlapping head regions accumulate in the density maps and result in extreme ground-truth values. An intuitive solution is to apply super-resolution (SR) pre-processing to the input LR images. However, this complicates the inference pipeline and thus limits the method's applicability when real-time performance is required. We propose a more elegant method termed the Multi-Scale Super-Resolution Module (MSSRM). It guides the network to estimate the lost details and enhances the detailed information in the feature space. Notably, the MSSRM is plug-in plug-out and addresses the LR problem at no additional inference cost. As the proposed method requires SR labels, we further propose a Super-Resolution Crowd Counting dataset (SR-Crowd). Extensive experiments on three datasets demonstrate the superiority of our method. The code will be available at https://github.com/PRIS-CV/MSSRM.git.
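The second reason above, that overlapping head regions accumulate into extreme ground-truth values at low resolution, can be illustrated with a minimal sketch. It is not the paper's label-generation procedure: it assumes a simplified density map in which each annotated head adds 1 to the pixel it falls in (no Gaussian smoothing), and the function names, image size, and head positions are hypothetical.

```python
def density_map(heads, height, width, scale):
    """Build a point-count density map at 1/scale of the original resolution.

    Assumption: each head contributes exactly 1 to a single pixel.
    Real pipelines usually spread each head with a Gaussian kernel.
    """
    h, w = height // scale, width // scale
    grid = [[0.0] * w for _ in range(h)]
    for y, x in heads:
        gy = min(y // scale, h - 1)  # clamp to stay inside the grid
        gx = min(x // scale, w - 1)
        grid[gy][gx] += 1.0
    return grid


def total(grid):
    """Sum of the density map, i.e. the crowd count it encodes."""
    return sum(sum(row) for row in grid)


def peak(grid):
    """Largest single-pixel value in the density map."""
    return max(max(row) for row in grid)


# Hypothetical dense cluster: 8 x 8 = 64 heads spaced 2 pixels apart
# inside a 64 x 64 image.
heads = [(y, x) for y in range(8, 24, 2) for x in range(8, 24, 2)]

for scale in (1, 4, 8):
    dmap = density_map(heads, 64, 64, scale)
    print(f"scale 1/{scale}: count={total(dmap)}, peak={peak(dmap)}")
```

In this toy setting the encoded count is the same at every resolution, while the peak pixel value grows as the resolution drops, because more heads fall into the same pixel. The regression targets therefore become more extreme even though the scene has not changed.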