Hyperspectral image classification (HSIC) is a challenging task due to high spectral dimensionality, complex spectral-spatial correlations, and limited labeled training samples. Although transformer-based models have shown strong potential for HSIC, existing approaches often struggle to achieve sufficient spectral discriminability while maintaining computational efficiency. To address these limitations, we propose DSXFormer, a novel dual-pooling spectral squeeze-expansion transformer with Dynamic Context Attention for HSIC. DSXFormer introduces a Dual-Pooling Spectral Squeeze-Expansion (DSX) block, which exploits complementary global average and max pooling to adaptively recalibrate spectral feature channels, thereby enhancing spectral discriminability and inter-band dependency modeling. In addition, DSXFormer incorporates a Dynamic Context Attention (DCA) mechanism within a window-based transformer architecture to dynamically capture local spectral-spatial relationships while significantly reducing computational overhead. Combining the DSX block with DCA enables DSXFormer to balance spectral emphasis against spatial contextual representation. Furthermore, patch extraction, embedding, and patch merging strategies are employed to facilitate efficient multi-scale feature learning. Extensive experiments on four widely used hyperspectral benchmark datasets, namely Salinas (SA), Indian Pines (IP), Pavia University (PU), and Kennedy Space Center (KSC), demonstrate that DSXFormer consistently outperforms state-of-the-art methods, achieving classification accuracies of 99.95%, 98.91%, 99.85%, and 98.52%, respectively.
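To make the dual-pooling recalibration idea concrete, the following is a minimal NumPy sketch of a squeeze-expansion gate driven by both global average and global max pooling. It is an illustration under stated assumptions, not the DSXFormer implementation: the abstract does not specify how the two pooled descriptors are combined, so the sketch assumes a shared two-layer bottleneck whose outputs are summed before a sigmoid gate. The names `dsx_recalibrate`, `w_squeeze`, `w_expand`, and the reduction ratio of 4 are hypothetical, and the weights are random rather than learned.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def dsx_recalibrate(x, w_squeeze, w_expand):
    """Rescale the channels of x (shape: H, W, C) with dual-pooled gates.

    w_squeeze: (C, C // r) weights that reduce the channel descriptor.
    w_expand:  (C // r, C) weights that restore it to C channels.
    """
    # Squeeze: two global descriptors per channel, pooled over H and W.
    avg_desc = x.mean(axis=(0, 1))  # shape (C,)
    max_desc = x.max(axis=(0, 1))   # shape (C,)

    def bottleneck(desc):
        hidden = np.maximum(desc @ w_squeeze, 0.0)  # ReLU after the squeeze
        return hidden @ w_expand                    # expansion back to C

    # Assumption: both descriptors pass through the same bottleneck and are
    # summed before the sigmoid; the abstract does not state the fusion rule.
    gates = sigmoid(bottleneck(avg_desc) + bottleneck(max_desc))  # shape (C,)

    # Recalibrate: broadcast one gate per channel over all spatial positions.
    return x * gates, gates


# Hypothetical sizes: a 9x9 spatial patch with 32 feature channels.
rng = np.random.default_rng(0)
height, width, channels, reduction = 9, 9, 32, 4
patch = rng.standard_normal((height, width, channels))
w_squeeze = rng.standard_normal((channels, channels // reduction)) * 0.1
w_expand = rng.standard_normal((channels // reduction, channels)) * 0.1

recalibrated, gates = dsx_recalibrate(patch, w_squeeze, w_expand)
print(recalibrated.shape, gates.shape)
```

Each gate lies strictly between 0 and 1, so the block can only attenuate or preserve a channel relative to its input. Average pooling summarizes the overall response of a channel, while max pooling keeps its strongest activation, which is why the two descriptors are described as complementary.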