Recently, the FourCastNet Neural Earth System Model (NESM) has shown impressive results in predicting various atmospheric variables when trained on the ERA5 reanalysis dataset. While FourCastNet enjoys quasi-linear time and memory complexity in sequence length, compared to the quadratic complexity of vanilla transformers, training FourCastNet on ERA5 from scratch still requires a large amount of compute, which is expensive or even inaccessible to most researchers. In this work, we show improved methods that can train FourCastNet using only 1% of the compute required by the baseline, while maintaining performance on par with or even better than the baseline.