Single-frame infrared small target detection is a challenging task for three reasons: the extreme imbalance between target and background, the extreme sensitivity of bounding box regression to infrared small targets, and the tendency of target information to be lost in high-level semantic layers. In this article, we propose an enhancing feature learning network (EFLNet) to address these problems. First, the extreme imbalance between target and background in infrared images makes the model pay more attention to background features than to target features. To address this, we propose a new adaptive threshold focal loss (ATFL) function that decouples the target from the background and uses an adaptive mechanism to adjust the loss weight, forcing the model to allocate more attention to target features. Second, we introduce the normalized Gaussian Wasserstein distance (NWD) to alleviate the convergence difficulty caused by the extreme sensitivity of bounding box regression to infrared small targets. Finally, we incorporate a dynamic head mechanism into the network to enable adaptive learning of the relative importance of each semantic layer. Experimental results demonstrate that our method achieves better detection performance on infrared small targets than state-of-the-art (SOTA) deep-learning-based methods. The source code and bounding-box-annotated datasets are available at https://github.com/YangBo0411/infrared-small-target.
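To illustrate why the NWD is less sensitive than IoU for very small boxes, the following is a minimal sketch of the commonly cited NWD formulation, not the EFLNet implementation. It assumes each box `(cx, cy, w, h)` is modeled as a 2-D Gaussian with mean `(cx, cy)` and covariance `diag(w^2/4, h^2/4)`, so the squared 2-Wasserstein distance reduces to a squared Euclidean distance between `(cx, cy, w/2, h/2)` vectors. The function names and the normalizing constant `c=12.8` are illustrative assumptions; in practice `c` would be chosen per dataset.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Gaussian Wasserstein distance between two boxes.

    Boxes are (cx, cy, w, h). The constant c is an assumed, illustrative
    normalizer and is not taken from the EFLNet paper.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two box Gaussians.
    w2_sq = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    # Exponential normalization maps the distance into (0, 1].
    return math.exp(-math.sqrt(w2_sq) / c)


def iou(box_a, box_b):
    """Plain IoU for (cx, cy, w, h) boxes, included only for comparison."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    target = (10.0, 10.0, 4.0, 4.0)      # a 4x4-pixel target
    near_miss = (12.0, 10.0, 4.0, 4.0)   # prediction shifted by 2 pixels
    far_miss = (15.0, 10.0, 4.0, 4.0)    # prediction with no overlap
    for pred in (near_miss, far_miss):
        print(f"IoU={iou(target, pred):.3f}  NWD={nwd(target, pred):.3f}")
```

For the non-overlapping prediction the IoU is exactly zero and carries no gradient signal, whereas the NWD stays positive and decreases smoothly with the offset. This is the property the abstract relies on when it describes NWD as easing convergence for small targets.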