Detecting objects in aerial images is challenging because these images typically contain crowded small objects distributed non-uniformly over a high-resolution frame. Density cropping is a widely used method for improving small object detection: regions crowded with small objects are extracted and processed at high resolution. However, this is typically accomplished by adding other learnable components, which complicates training and inference relative to a standard detection process. In this paper, we propose an efficient Cascaded Zoom-in (CZ) detector that re-purposes the detector itself for density-guided training and inference. During training, density crops are located, labeled as a new class, and used to augment the training dataset. During inference, the density crops are first detected along with the base class objects, and then fed to a second stage of inference. This approach is easily integrated into any detector and, like the uniform cropping approach popular in aerial image detection, creates no significant change in the standard detection process. Experimental results on aerial images from the challenging VisDrone and DOTA datasets verify the benefits of the proposed approach. On the VisDrone dataset, the proposed CZ detector also achieves state-of-the-art results, outperforming uniform cropping and other density cropping methods and increasing the detection mAP of small objects by more than 3 points.
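The two-stage inference described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `detect` and `crop_and_upscale` callables, the `crop_class` id, the score threshold, the fixed zoom `scale`, and the final class-wise NMS merge are all assumptions introduced here for illustration.

```python
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Det = Tuple[Box, float, int]             # (box, score, class id)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(dets: List[Det], thr: float = 0.5) -> List[Det]:
    """Greedy class-wise NMS; the merge strategy is an assumption of this sketch."""
    kept: List[Det] = []
    for d in sorted(dets, key=lambda d: d[1], reverse=True):
        if all(d[2] != k[2] or iou(d[0], k[0]) < thr for k in kept):
            kept.append(d)
    return kept


def cascaded_zoom_in(
    image,
    detect: Callable[[object], List[Det]],                    # hypothetical detector
    crop_and_upscale: Callable[[object, Box, float], object],  # hypothetical helper
    crop_class: int,                                           # id of the density-crop class
    crop_score_thr: float = 0.5,                               # assumed threshold
    scale: float = 2.0,                                        # assumed zoom factor
) -> List[Det]:
    # Stage 1: one pass over the full image yields base-class objects
    # and density crops (detected as an extra class).
    first = detect(image)
    objects = [d for d in first if d[2] != crop_class]
    crops = [d for d in first if d[2] == crop_class and d[1] >= crop_score_thr]

    # Stage 2: run the same detector on each upscaled crop and map the
    # resulting boxes back to full-image coordinates.
    for (x1, y1, x2, y2), _, _ in crops:
        zoomed = crop_and_upscale(image, (x1, y1, x2, y2), scale)
        for (bx1, by1, bx2, by2), score, cls in detect(zoomed):
            if cls == crop_class:
                continue  # no further zoom levels in this sketch
            box = (x1 + bx1 / scale, y1 + by1 / scale,
                   x1 + bx2 / scale, y1 + by2 / scale)
            objects.append((box, score, cls))

    return nms(objects)
```

Because the same `detect` callable serves both stages, no extra learnable component is needed; only the additional crop class distinguishes this from a standard detection pass.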