Small object detection remains a challenging problem in object detection. Several works have proposed improvements for this task, such as adding multiple attention blocks or restructuring the entire feature fusion network. However, these models are computationally expensive, which makes deploying a real-time object detection system infeasible, while still leaving room for improvement. To this end, we propose HIC-YOLOv5, an improved YOLOv5 model that addresses these problems. First, an additional prediction head dedicated to small objects is added, providing a higher-resolution feature map for better prediction. Second, an involution block is inserted between the backbone and the neck to increase the channel information of the feature map. Third, the CBAM attention mechanism is applied at the end of the backbone, which both reduces the computation cost compared with previous works and emphasizes important information in the channel and spatial domains. Our results show that HIC-YOLOv5 improves mAP@[.5:.95] by 6.42% and mAP@0.5 by 9.38% on the VisDrone-2019-DET dataset.
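To illustrate why an extra prediction head can help with small objects, the following minimal sketch computes the grid resolution of each detection head. It is not the authors' implementation. The 640-pixel input, the strides 8, 16 and 32 for a stock YOLOv5, and the stride of 4 for the added head are assumptions made for illustration; the abstract states only that the new head provides a higher-resolution feature map. The function names are hypothetical.

```python
def head_grid_sizes(input_size, strides):
    """Return {stride: (grid_h, grid_w)} for a square input image.

    Each detection head predicts on a feature map downsampled by its stride,
    so a smaller stride means a finer grid.
    """
    sizes = {}
    for stride in strides:
        if input_size % stride != 0:
            raise ValueError(f"input size {input_size} is not divisible by stride {stride}")
        side = input_size // stride
        sizes[stride] = (side, side)
    return sizes


def cells_spanned(object_px, stride):
    """Approximate number of grid cells covered by an object object_px wide."""
    return object_px / stride


# Assumed configuration: stock heads at strides 8/16/32, extra head at stride 4.
BASELINE_STRIDES = [8, 16, 32]
EXTENDED_STRIDES = [4] + BASELINE_STRIDES

if __name__ == "__main__":
    for stride, grid in head_grid_sizes(640, EXTENDED_STRIDES).items():
        print(f"stride {stride:>2}: grid {grid[0]}x{grid[1]}")
    # A 12-pixel-wide object covers 1.5 cells at stride 8 but 3 cells at stride 4.
    print(cells_spanned(12, 8), cells_spanned(12, 4))
```

Under these assumptions, the added head works on a 160x160 grid instead of the 80x80 grid of the finest baseline head, so an object only a few pixels wide spans more grid cells.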