Super-resolution can play an essential role in enhancing the spatial fidelity of Earth System Model outputs, allowing fine-scale structures that are highly beneficial to climate science to be recovered from coarse simulations. However, traditional deep super-resolution methods, including convolutional and transformer-based models, tend to exhibit spectral bias: they reconstruct low-frequency content more readily than valuable high-frequency details. In this work, we introduce two frequency-aware frameworks, ViSIR and ViFOR. ViSIR (Vision Transformer-Tuned Sinusoidal Implicit Representation) combines vision transformers with sinusoidal activations to mitigate spectral bias. ViFOR (Vision Transformer Fourier Representation Network) integrates explicit Fourier-based filtering so that low- and high-frequency components are learned independently. Evaluated on the E3SM-HR Earth system dataset across surface temperature, shortwave flux, and longwave flux, these models outperform leading convolutional neural network, generative network, and vanilla transformer baselines, with ViFOR achieving improvements of up to 2.6 dB in peak signal-to-noise ratio (PSNR) and higher structural similarity (SSIM).
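To make the frequency-aware idea concrete, the following is a minimal NumPy sketch of an explicit Fourier split of a 2D field into low- and high-frequency parts, together with a PSNR computation of the kind used for evaluation. It is an illustration only, not the ViFOR implementation: the function names `split_frequencies` and `psnr`, the radial mask, and the `cutoff` value are assumptions introduced here.

```python
import numpy as np


def split_frequencies(field, cutoff=0.1):
    """Split a 2D field into low- and high-frequency parts.

    Uses a hard radial mask in the FFT domain. `cutoff` is a hypothetical
    threshold in cycles per grid cell, chosen for illustration only.
    """
    field = np.asarray(field, dtype=np.float64)
    fy = np.fft.fftfreq(field.shape[0])[:, None]  # row frequencies
    fx = np.fft.fftfreq(field.shape[1])[None, :]  # column frequencies
    low_mask = np.sqrt(fx**2 + fy**2) <= cutoff   # True for low frequencies

    spectrum = np.fft.fft2(field)
    low = np.fft.ifft2(spectrum * low_mask).real
    high = np.fft.ifft2(spectrum * ~low_mask).real
    return low, high


def psnr(reference, estimate, data_range=None):
    """Peak signal-to-noise ratio in dB.

    Assumption: when `data_range` is not given, the dynamic range is taken
    as max - min of the reference field.
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    if data_range is None:
        data_range = reference.max() - reference.min()
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range**2 / mse)


# Usage: a synthetic field stands in for a coarse model output.
rng = np.random.default_rng(0)
demo_field = rng.standard_normal((32, 48))
demo_low, demo_high = split_frequencies(demo_field, cutoff=0.1)
demo_score = psnr(demo_field, demo_low)  # PSNR when only low frequencies are kept
```

Because the two masks are complementary, the low- and high-frequency parts sum back to the original field, so a model could in principle learn each band separately and recombine the results. A learned or smooth filter could replace the hard mask; that choice is outside what the abstract specifies.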