Land cover analysis using hyperspectral images (HSI) remains an open problem because of their low spatial resolution and complex spectral information. Recent studies have focused mainly on Transformer-based architectures for modeling long-range spatial-spectral dependencies, which are computationally expensive because of their quadratic complexity. The selective structured state space model (Mamba), which models long-range dependencies efficiently with linear complexity, has recently shown promising progress. However, its potential for hyperspectral image processing, which requires handling numerous spectral bands, has not yet been explored. In this paper, we propose S$^2$Mamba, a spatial-spectral state space model for hyperspectral image classification that extracts spatial-spectral contextual features for more efficient and accurate land cover analysis. S$^2$Mamba uses two selective structured state space models that operate along different dimensions for feature extraction, one spatial and one spectral, together with a spatial-spectral mixture gate that fuses their outputs. More specifically, S$^2$Mamba first captures spatial contextual relations by letting each pixel interact with its neighbors through a Patch Cross Scanning module, and then explores semantic information from continuous spectral bands through a Bi-directional Spectral Scanning module. Because the spatial and spectral attributes have distinct strengths in homogeneous and complex-texture scenes, we implement the Spatial-spectral Mixture Gate as a group of learnable matrices, which allows representations learned along the different dimensions to be combined adaptively. Extensive experiments on HSI classification benchmarks demonstrate the superiority and potential of S$^2$Mamba. The code will be made available at: https://github.com/PURE-melo/S2Mamba.
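To make the two ideas in the abstract more concrete, the following is a minimal sketch of a bi-directional scan over spectral bands and a gated fusion of spatial and spectral features. It is not the authors' implementation: a fixed scalar recurrence `h_t = a * h_{t-1} + b * x_t` stands in for Mamba's selective state space model, the gate is a per-feature softmax over two weights rather than the learnable matrices described above, and the names `linear_scan`, `bidirectional_scan`, `mixture_gate`, `a`, and `b` are all hypothetical.

```python
import math


def linear_scan(xs, a=0.5, b=1.0):
    """Run the recurrence h_t = a * h_{t-1} + b * x_t and return every h_t.

    Assumption: a and b are fixed scalars here; a selective SSM would
    compute its parameters from the input at each step.
    """
    h, out = 0.0, []
    for x in xs:
        h = a * h + b * x
        out.append(h)
    return out


def bidirectional_scan(xs, a=0.5, b=1.0):
    """Scan the band sequence forward and backward, then sum per position."""
    fwd = linear_scan(xs, a, b)
    bwd = linear_scan(xs[::-1], a, b)[::-1]  # re-align backward pass with band order
    return [f + g for f, g in zip(fwd, bwd)]


def mixture_gate(spatial, spectral, w_spatial, w_spectral):
    """Fuse two feature vectors using a per-feature softmax over two weights.

    Assumption: w_spatial and w_spectral play the role of learnable gate
    parameters; in this sketch they are plain lists of floats.
    """
    fused = []
    for s, c, ws, wc in zip(spatial, spectral, w_spatial, w_spectral):
        m = max(ws, wc)  # subtract the max for numerical stability
        es, ec = math.exp(ws - m), math.exp(wc - m)
        g = es / (es + ec)  # share given to the spatial branch
        fused.append(g * s + (1.0 - g) * c)
    return fused


if __name__ == "__main__":
    bands = [1.0, 2.0, 4.0]            # toy spectrum of one pixel
    spectral_feat = bidirectional_scan(bands)
    spatial_feat = [2.0, 2.0, 2.0]     # placeholder spatial features
    equal = [0.0, 0.0, 0.0]            # equal gate weights: plain average
    print(spectral_feat)               # [4.0, 6.5, 9.25]
    print(mixture_gate(spatial_feat, spectral_feat, equal, equal))
```

With equal gate weights the fusion reduces to an average of the two branches; pushing one weight far above the other makes the output follow that branch almost exclusively, which is the behavior an adaptive gate relies on.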