Learning Exhaustive Correlation for Spectral Super-Resolution: Where Unified Spatial-Spectral Attention Meets Mutual Linear Dependence

Spectral super-resolution, which reconstructs a hyperspectral image (HSI) from an easily obtainable RGB image, has drawn increasing interest in computational photography. The key to spectral super-resolution is exploiting the correlation within HSIs. However, two bottlenecks in existing Transformers limit both performance and practical application. First, existing Transformers typically emphasize either spatial-wise or spectral-wise correlation separately, which disrupts the 3D features of the HSI and hinders the exploitation of unified spatial-spectral correlation. Second, the existing self-attention mechanism learns correlation only between pairs of tokens and captures a full-rank correlation matrix, so it cannot establish mutual linear dependence among multiple tokens. To address these issues, we propose a novel Exhaustive Correlation Transformer (ECT) for spectral super-resolution. First, we propose a Spectral-wise Discontinuous 3D (SD3D) splitting strategy, which models unified spatial-spectral correlation by combining spatial-wise continuous splitting with spectral-wise discontinuous splitting. Second, we propose a Dynamic Low-Rank Mapping (DLRM) model, which captures mutual linear dependence among multiple tokens through a dynamically calculated low-rank dependence map. By integrating unified spatial-spectral attention with mutual linear dependence, ECT can establish exhaustive correlation within HSIs. Experimental results on both simulated and real data indicate that our method achieves state-of-the-art performance. Code and pretrained models will be made available later.
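The two ideas named above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `sd3d_split` and `low_rank_map`, the window and group sizes, and the projection matrices `w_a` and `w_b` are all hypothetical choices made for illustration. The first function forms 3D blocks that are contiguous in space but strided (discontinuous) along the spectral axis; the second builds a token-dependent map as a product of two thin factors, so its rank is bounded by the inner dimension.

```python
import numpy as np


def sd3d_split(x, window=4, groups=2):
    """Split a feature cube of shape (C, H, W) into 3D blocks.

    Spatially, each block is a contiguous window x window patch.
    Spectrally, block group g holds channels g, g + groups, g + 2*groups, ...
    (a strided, i.e. discontinuous, selection). Assumed layout, for illustration.
    Returns shape (groups, H // window, W // window, C // groups, window, window).
    """
    c, h, w = x.shape
    assert c % groups == 0 and h % window == 0 and w % window == 0
    # Channel index c = k * groups + g, row = i * window + a, col = j * window + b.
    x = x.reshape(c // groups, groups, h // window, window, w // window, window)
    # Reorder axes from (k, g, i, a, j, b) to (g, i, j, k, a, b).
    return x.transpose(1, 2, 4, 0, 3, 5)


def low_rank_map(tokens, w_a, w_b):
    """Build an (N, N) dependence map from tokens of shape (N, d).

    The map is the product of an (N, r) factor and an (r, N) factor, both
    computed from the tokens themselves, so its rank is at most r.
    w_a and w_b are hypothetical (d, r) projection matrices.
    """
    a = tokens @ w_a        # (N, r)
    b = (tokens @ w_b).T    # (r, N)
    return a @ b            # (N, N), rank <= r


if __name__ == "__main__":
    cube = np.arange(8 * 8 * 8, dtype=np.float64).reshape(8, 8, 8)
    blocks = sd3d_split(cube, window=4, groups=2)
    print(blocks.shape)

    rng = np.random.default_rng(0)
    tokens = rng.standard_normal((16, 8))
    dep = low_rank_map(tokens, rng.standard_normal((8, 3)), rng.standard_normal((8, 3)))
    mixed = dep @ tokens    # each output token mixes all tokens through r shared factors
    print(np.linalg.matrix_rank(dep), mixed.shape)
```

Because the map factors through an r-dimensional bottleneck, every row of it is a linear combination of the same r rows, which is one simple way to express linear dependence among many tokens at once, in contrast to a pairwise attention matrix that is generally full rank.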
