Single-frame infrared small target detection is a challenging task for three reasons: the extreme imbalance between target and background, the extreme sensitivity of bounding box regression to small targets, and the tendency of small target information to be lost in high-level semantic layers. In this paper, we propose an enhancing feature learning network (EFLNet), based on the YOLOv7 framework, to address these problems. First, the extreme imbalance between target and background in infrared images leads the model to pay more attention to background features, resulting in missed detections. To address this, we propose a new adaptive threshold focal loss function that adjusts the loss weight automatically, compelling the model to allocate greater attention to target features. Second, we introduce the normalized Gaussian Wasserstein distance to alleviate the convergence difficulty caused by the extreme sensitivity of bounding box regression to infrared small targets. Finally, we incorporate a dynamic head mechanism into the network to enable adaptive learning of the relative importance of each semantic layer. Experimental results demonstrate that our method achieves better infrared small target detection performance than state-of-the-art deep-learning-based methods.
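To illustrate why bounding box regression is so sensitive for small targets, the following minimal sketch compares IoU with a normalized Gaussian Wasserstein distance under the same 2-pixel localization error. The abstract does not give the formula, so this follows the commonly cited formulation in which a box (cx, cy, w, h) is modeled as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4). The function names `iou` and `nwd` and the normalizing constant `c = 12.8` are assumptions made for illustration, not details taken from EFLNet.

```python
import math


def iou(box_a, box_b):
    """Plain IoU for boxes given as (cx, cy, w, h)."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd(box_a, box_b, c=12.8):
    """Normalized Gaussian Wasserstein distance for (cx, cy, w, h) boxes.

    c is a normalizing constant; 12.8 is an assumed value, typically
    chosen in relation to the average target size of the dataset.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two box Gaussians.
    w2_sq = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    return math.exp(-math.sqrt(w2_sq) / c)


# Same 2-pixel horizontal shift applied to a small and a large box.
small_gt, small_pred = (10.0, 10.0, 4.0, 4.0), (12.0, 10.0, 4.0, 4.0)
large_gt, large_pred = (100.0, 100.0, 40.0, 40.0), (102.0, 100.0, 40.0, 40.0)

print("IoU small:", iou(small_gt, small_pred))
print("IoU large:", iou(large_gt, large_pred))
print("NWD small:", nwd(small_gt, small_pred))
print("NWD large:", nwd(large_gt, large_pred))
```

In this toy case the 2-pixel shift cuts the IoU of the 4×4 box to one third while the 40×40 box stays above 0.9, whereas the NWD is identical for both because it depends only on the center and size differences. A regression loss would then typically be taken as `1 - nwd(...)`, though how EFLNet combines it with other terms is not stated in the abstract.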