HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model

Di Wang, Meiqi Hu, Yao Jin, Yuchun Miao, Jiaqi Yang, Yichu Xu, Xiaolei Qin, Jiaqi Ma, Lingyu Sun, Chenxing Li, Chuan Fu, Hongruixuan Chen, Chengxi Han, Naoto Yokoya, Jing Zhang, Minqiang Xu, Lin Liu, Lefei Zhang, Chen Wu, Bo Du, Dacheng Tao, Liangpei Zhang

From arXiv. Accepted by IEEE TPAMI. Project website: https://whu-sigma.github.io/HyperSIGMA

Accurate hyperspectral image (HSI) interpretation is critical for Earth observation applications such as urban planning, precision agriculture, and environmental monitoring. However, existing HSI processing methods are predominantly task-specific and scene-dependent, which severely limits their ability to transfer knowledge across tasks and scenes and reduces their practicality in real-world applications. To address these challenges, we present HyperSIGMA, a vision transformer-based foundation model that unifies HSI interpretation across tasks and scenes and scales to over one billion parameters. To overcome the spectral and spatial redundancy inherent in HSIs, we introduce a novel sparse sampling attention (SSA) mechanism, which promotes the learning of diverse contextual features and serves as the basic block of HyperSIGMA. HyperSIGMA integrates spatial and spectral features using a specially designed spectral enhancement module. In addition, we construct a large-scale hyperspectral dataset, HyperGlobal-450K, for pre-training; it contains about 450K hyperspectral images, significantly surpassing existing datasets in scale. Extensive experiments on various high-level and low-level HSI tasks demonstrate HyperSIGMA's versatility and superior representational capability compared to current state-of-the-art methods. Moreover, HyperSIGMA shows significant advantages in scalability, robustness, cross-modal transfer capability, real-world applicability, and computational efficiency. The code and models will be released at https://github.com/WHU-Sigma/HyperSIGMA.

