Object Detection in Hyperspectral Image via Unified Spectral-Spatial Feature Aggregation

Deep learning-based hyperspectral image (HSI) classification and object detection techniques have gained significant attention due to their vital role in image content analysis, interpretation, and wider HSI applications. However, current hyperspectral object detection approaches predominantly emphasize either spectral or spatial information, overlooking the complementary relationship between the two. In this study, we present a novel \textbf{S}pectral-\textbf{S}patial \textbf{A}ggregation (S2ADet) object detector that exploits the complementary spectral and spatial information inherent in hyperspectral images. S2ADet comprises a hyperspectral information decoupling (HID) module, a two-stream feature extraction network, and a one-stage detection head. The HID module processes hyperspectral images by aggregating spectral and spatial information via band selection and principal component analysis, thereby reducing redundancy. Based on the resulting spatial and spectral aggregation information, we propose a two-stream feature aggregation network in which spectral and spatial features interact. Furthermore, to address the limitations of existing databases, we annotate an extensive dataset, designated HOD3K, containing 3,242 hyperspectral images captured across diverse real-world scenes and covering three object classes. These images have a resolution of 512$\times$256 pixels and cover 16 bands ranging from 470 nm to 620 nm. Comprehensive experiments on two datasets demonstrate that S2ADet surpasses existing state-of-the-art methods, achieving robust and reliable results. The demo code and dataset of this work are publicly available at \url{https://github.com/hexiao-cs/S2ADet}.
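
To make the decoupling step concrete, the following is a minimal sketch of how band selection and principal component analysis could each reduce a 16-band cube to a few channels. It is not the authors' implementation: the function names (`select_bands`, `pca_project`, `decouple`), the variance-based selection criterion, the choice of three output channels, and the small random cube are all assumptions made for illustration. The abstract does not specify the selection rule or which output feeds which stream of the network.

```python
import numpy as np


def select_bands(cube, k=3):
    """Keep the k bands with the highest variance.

    The variance criterion is an assumption for this sketch; the abstract
    only states that band selection is used.
    """
    h, w, b = cube.shape
    variances = cube.reshape(-1, b).var(axis=0)
    idx = np.sort(np.argsort(variances)[-k:])  # indices of the top-k bands
    return cube[:, :, idx], idx


def pca_project(cube, k=3):
    """Project every pixel spectrum onto the top-k principal components."""
    h, w, b = cube.shape
    flat = cube.reshape(-1, b).astype(np.float64)
    centered = flat - flat.mean(axis=0)
    cov = centered.T @ centered / (flat.shape[0] - 1)
    # eigh returns eigenvalues in ascending order, so reverse the columns.
    _, eigvecs = np.linalg.eigh(cov)
    top = eigvecs[:, ::-1][:, :k]
    return (centered @ top).reshape(h, w, k)


def decouple(cube, k=3):
    """Hypothetical stand-in for the HID module: two reduced views of one cube."""
    selected, idx = select_bands(cube, k)
    components = pca_project(cube, k)
    return selected, components, idx


# Small random stand-in for a 16-band image (not real HOD3K data).
rng = np.random.default_rng(0)
cube = rng.random((32, 64, 16))
selected, components, idx = decouple(cube, k=3)
print(selected.shape, components.shape, idx)
```

Both outputs have the same spatial size as the input but only three channels each, which is the sense in which the decoupling reduces redundancy before the two-stream network sees the data.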
