HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model

Di Wang, Meiqi Hu, Yao Jin, Yuchun Miao, Jiaqi Yang, Yichu Xu, Xiaolei Qin, Jiaqi Ma, Lingyu Sun, Chenxing Li, Chuan Fu, Hongruixuan Chen, Chengxi Han, Naoto Yokoya, Jing Zhang, Minqiang Xu, Lin Liu, Lefei Zhang, Chen Wu, Bo Du, Dacheng Tao, Liangpei Zhang

Source: arXiv. The code and models will be released at https://github.com/WHU-Sigma/HyperSIGMA

Foundation models (FMs) are revolutionizing the analysis and understanding of remote sensing (RS) scenes, including aerial RGB, multispectral, and SAR images. However, hyperspectral images (HSIs), which are rich in spectral information, have not seen much application of FMs, with existing methods often restricted to specific tasks and lacking generality. To fill this gap, we introduce HyperSIGMA, a vision transformer-based foundation model for HSI interpretation, scalable to over a billion parameters. To tackle the spectral and spatial redundancy challenges in HSIs, we introduce a novel sparse sampling attention (SSA) mechanism, which effectively promotes the learning of diverse contextual features and serves as the basic block of HyperSIGMA. HyperSIGMA integrates spatial and spectral features using a specially designed spectral enhancement module. In addition, we construct a large-scale hyperspectral dataset, HyperGlobal-450K, for pre-training, which contains about 450K hyperspectral images, significantly surpassing existing datasets in scale. Extensive experiments on various high-level and low-level HSI tasks demonstrate HyperSIGMA's versatility and superior representational capability compared to current state-of-the-art methods. Moreover, HyperSIGMA shows significant advantages in scalability, robustness, cross-modal transferring capability, and real-world applicability.
