Atmospheric science is intricately connected with other fields, such as geography and aerospace. Most existing approaches train a joint atmospheric and geographic model from scratch, which incurs significant computational cost and overlooks the potential for incremental learning of weather variables across different domains. In this paper, we introduce incremental learning to weather forecasting and propose a novel structure that allows variables to be flexibly expanded within the model. Specifically, our method presents a Channel-Adapted MoE (CA-MoE) that employs a divide-and-conquer strategy: it assigns variable training tasks to different experts via index embeddings and reduces computational complexity through a channel-wise Top-K strategy. Experiments on the widely used ERA5 dataset show that our method, using only about 15\% of the trainable parameters during the incremental stage, attains performance on par with state-of-the-art competitors. Notably, in variable-incremental experiments, our method exhibits negligible catastrophic forgetting.
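The channel-wise Top-K routing described above can be illustrated with a minimal sketch. All names, sizes, and the gating formulation here are illustrative assumptions, not the paper's actual implementation: each variable (channel) carries a learnable index embedding, a gate scores the experts per channel, and only the Top-K experts are evaluated for that channel.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical; the real model's sizes differ).
num_experts, num_channels, dim, top_k = 4, 6, 8, 2

# One index embedding per variable/channel, used for routing (assumption:
# in practice these would be learned jointly with the experts).
index_emb = rng.normal(size=(num_channels, dim))
gate_w = rng.normal(size=(dim, num_experts))
experts = [rng.normal(size=(dim, dim)) for _ in range(num_experts)]

def channel_topk_moe(x):
    """x: (num_channels, dim). Route each channel to its Top-K experts."""
    scores = index_emb @ gate_w                    # (channels, experts)
    topk = np.argsort(scores, axis=1)[:, -top_k:]  # Top-K expert ids per channel
    out = np.zeros_like(x)
    for c in range(num_channels):
        sel = scores[c, topk[c]]
        w = np.exp(sel - sel.max())                # softmax over selected experts
        w /= w.sum()
        for j, e in enumerate(topk[c]):
            out[c] += w[j] * (x[c] @ experts[e])   # sparse weighted combination
    return out

y = channel_topk_moe(rng.normal(size=(num_channels, dim)))
print(y.shape)  # (6, 8)
```

Because each channel touches only `top_k` of the `num_experts` experts, the per-channel compute scales with K rather than with the total number of experts, which is the complexity reduction the abstract refers to.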