Object detection in aerial images is a fundamental research topic in geoscience and remote sensing. However, state-of-the-art approaches mainly focus on designing elaborate backbone or head networks and overlook the neck network. In this letter, we first underline the importance of the neck network in object detection from the perspective of the information bottleneck. Then, to alleviate the information deficiency problem in current approaches, we propose a global semantic network (GSNet), which acts as a bridge from the backbone network to the head network in a bidirectional global pattern. Compared with existing approaches, our model captures rich, enhanced image features at a lower computational cost. In addition, we propose a feature fusion refinement module (FRM) to address the semantic gap that arises when features from different levels are fused. To demonstrate the effectiveness and efficiency of our approach, we carry out experiments on two challenging and representative aerial image datasets, DOTA and HRSC2016. Experimental results in terms of accuracy and complexity validate the superiority of our method. The code has been open-sourced at GSNet.