Recurrent neural networks and Transformers have recently dominated most applications in hyperspectral (HS) imaging, owing to their ability to capture long-range dependencies in spectral sequences. Despite this success, these sequential architectures remain inefficient, because recurrence is difficult to parallelize and attention is computationally prohibitive. This limits their practicality, especially for large-scale observation in remote sensing. To address this issue, we propose SpectralMamba, a novel and efficient deep learning framework for HS image classification that incorporates a state space model. SpectralMamba models HS data dynamics in a simplified but adequate way at two levels. First, in the spatial-spectral space, a dynamical mask is learned by efficient convolutions to simultaneously encode spatial regularity and spectral peculiarity, thereby attenuating spectral variability and confusion in discriminative representation learning. Second, the merged spectrum is processed efficiently in the hidden state space with all parameters learned in an input-dependent manner, yielding selectively focused responses without relying on redundant attention or non-parallelizable recurrence. To further reduce computation, a piece-wise scanning mechanism is employed between these two levels, converting the approximately continuous spectrum into sequences of reduced length while preserving short- and long-term contextual profiles among hundreds of bands. In extensive experiments on four benchmark HS datasets acquired by satellite-, aircraft-, and UAV-borne imagers, SpectralMamba delivers promising results in both classification performance and efficiency.
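The piece-wise scanning and input-dependent state space ideas described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names (`piecewise_scan`, `selective_scan`), the zero-padding of the last piece, the weight shapes, and the random initialization are all assumptions made for illustration. The recurrence follows the general Mamba-style selective update, and it is written as a plain sequential loop for readability, whereas practical implementations typically rely on a parallel scan.

```python
import numpy as np


def piecewise_scan(spectrum, piece_len):
    """Split a 1-D spectrum of B bands into ceil(B / piece_len) pieces.

    The tail is zero-padded so every piece has piece_len bands
    (the padding scheme is an assumption, not taken from the paper).
    Returns an array of shape (num_pieces, piece_len).
    """
    spectrum = np.asarray(spectrum, dtype=float)
    num_bands = spectrum.shape[0]
    num_pieces = -(-num_bands // piece_len)  # ceiling division
    padded = np.zeros(num_pieces * piece_len)
    padded[:num_bands] = spectrum
    return padded.reshape(num_pieces, piece_len)


def softplus(x):
    # Numerically stable log(1 + exp(x)); keeps the step size positive.
    return np.logaddexp(0.0, x)


def selective_scan(tokens, a, w_delta, w_b, w_c):
    """Toy input-dependent state space recurrence over a token sequence.

    tokens:  (seq_len, dim) sequence produced by piecewise_scan
    a:       (dim, n_state) negative decay parameters
    w_delta: (dim, dim), w_b and w_c: (dim, n_state) hypothetical projections
    The step size, input matrix, and output matrix all depend on the token.
    """
    seq_len, dim = tokens.shape
    state = np.zeros_like(a)
    outputs = np.empty((seq_len, dim))
    for t in range(seq_len):
        x = tokens[t]
        delta = softplus(x @ w_delta)        # (dim,) input-dependent step
        b = x @ w_b                          # (n_state,) input-dependent B
        c = x @ w_c                          # (n_state,) input-dependent C
        decay = np.exp(delta[:, None] * a)   # (dim, n_state), values in (0, 1)
        state = decay * state + (delta * x)[:, None] * b[None, :]
        outputs[t] = state @ c               # (dim,) readout
    return outputs


rng = np.random.default_rng(0)
spectrum = rng.random(200)                       # one pixel with 200 bands
tokens = piecewise_scan(spectrum, piece_len=8)   # sequence length 200 -> 25
dim, n_state = tokens.shape[1], 4
a = -np.exp(rng.standard_normal((dim, n_state)))
w_delta = 0.1 * rng.standard_normal((dim, dim))
w_b = 0.1 * rng.standard_normal((dim, n_state))
w_c = 0.1 * rng.standard_normal((dim, n_state))
features = selective_scan(tokens, a, w_delta, w_b, w_c)
print(tokens.shape, features.shape)  # (25, 8) (25, 8)
```

With a piece length of 8, the recurrence runs over 25 steps instead of 200, which is the kind of sequence-length reduction the piece-wise scanning mechanism is meant to provide. Each token still carries its local group of bands, and the hidden state carries context across pieces.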