Object detection is a fundamental problem in computer vision that aims to locate and classify objects in an image. Although current devices can easily capture very high-resolution images, current object detection approaches seldom address tiny objects or large scale variance in such images. In this paper, we introduce a simple yet efficient approach that improves detection accuracy, especially for small objects and scenes with large scale variance, while reducing the computational cost on high-resolution images. Our approach is motivated by two observations. First, when an image is appropriately down-sampled, overall detection accuracy drops but the recall rate is not significantly reduced. Second, small objects are detected better from high-resolution input, even with a lightweight detector. We therefore propose a cluster-based coarse-to-fine object detection framework that improves small-object detection while preserving accuracy on large objects in high-resolution images. In the first stage, we perform coarse detection on the down-sampled image and use a lightweight detector to localize the centers of small objects in the high-resolution image. A cluster region generation method then produces image chips from the coarse detection and center localization results, and these chips are sent to the second-stage detector for fine detection. Finally, we merge the coarse and fine detection results. By exploiting both the sparsity of objects and the information in the high-resolution image, our approach makes detection more efficient. Experimental results show that the proposed approach achieves promising performance compared with other state-of-the-art detectors.
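The chip generation and result merging steps described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the clustering rule, the chip padding, or the merging procedure, so everything here is an assumption made for illustration: the function names (`cluster_points`, `chips_from_clusters`, `merge_detections`), the distance-threshold single-linkage clustering, the fixed padding, and the use of greedy class-agnostic non-maximum suppression for merging are hypothetical stand-ins, not the authors' method.

```python
from itertools import combinations
from math import hypot


def cluster_points(points, radius):
    """Group object centers whose distance is <= radius (single linkage)."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        (xi, yi), (xj, yj) = points[i], points[j]
        if hypot(xi - xj, yi - yj) <= radius:
            parent[find(i)] = find(j)

    groups = {}
    for idx, p in enumerate(points):
        groups.setdefault(find(idx), []).append(p)
    return list(groups.values())


def chips_from_clusters(clusters, pad, img_w, img_h):
    """Return one padded, image-clipped chip (x1, y1, x2, y2) per cluster."""
    chips = []
    for pts in clusters:
        xs = [x for x, _ in pts]
        ys = [y for _, y in pts]
        chips.append((
            max(0, min(xs) - pad), max(0, min(ys) - pad),
            min(img_w, max(xs) + pad), min(img_h, max(ys) + pad),
        ))
    return chips


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_detections(coarse, fine, iou_thr=0.5):
    """Greedy NMS over coarse + fine detections, each given as (box, score)."""
    kept = []
    for box, score in sorted(coarse + fine, key=lambda d: d[1], reverse=True):
        if all(iou(box, k[0]) < iou_thr for k in kept):
            kept.append((box, score))
    return kept
```

In this sketch, the centers predicted by the lightweight detector would be passed to `cluster_points`, each resulting chip would be cropped from the high-resolution image and sent to the second-stage detector, and the fine detections would need to be shifted back into full-image coordinates before calling `merge_detections`. The pairwise clustering loop is quadratic in the number of centers, which is acceptable for sparse scenes but would need a spatial index for dense ones.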