Efficiently modeling spatio-temporal (ST) physical processes and observations is a challenging problem for the deep learning community. Many recent studies have concentrated on meticulously reconciling the advantages of different approaches, resulting in models that are neither simple nor practical. To address this issue, this paper presents a systematic study of the shortcomings of off-the-shelf models, including lack of local fidelity, poor prediction performance over long time-steps, low scalability, and inefficiency. To address these problems, we propose EarthFarseer, a concise framework that combines parallel local convolutions with a global Fourier-based transformer architecture, enabling it to dynamically capture local-global spatial interactions and dependencies. EarthFarseer also incorporates multi-scale fully convolutional and Fourier architectures to capture temporal evolution efficiently and effectively. Our proposal demonstrates strong adaptability across various tasks and datasets, with fast convergence and better local fidelity in long time-step predictions. Extensive experiments and visualizations on eight datasets covering human-society and natural physical systems demonstrate the state-of-the-art performance of EarthFarseer. We release our code at https://github.com/easylearningscores/EarthFarseer.
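As an illustration of the local-global idea described in the abstract, the following is a minimal one-dimensional sketch in plain Python. It is not the authors' implementation (see the linked repository for that). The function names, the fixed smoothing kernel, the low-pass mode truncation, and the choice to sum the two branches are all assumptions made for illustration; in a trained model the kernel and the spectral weights would be learned, and the inputs would be multi-channel 2D fields.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(X):
    """Inverse of dft; returns a complex sequence."""
    n = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def local_branch(x, kernel):
    """Local branch (hypothetical): circular sliding-window filter.

    Each output depends only on a few neighbouring inputs. Like deep
    learning "convolution" layers, this is a cross-correlation.
    """
    n, half = len(x), len(kernel) // 2
    return [
        sum(kernel[j] * x[(i + j - half) % n] for j in range(len(kernel)))
        for i in range(n)
    ]


def global_branch(x, keep_modes):
    """Global branch (hypothetical): mix in Fourier space.

    Keeps the `keep_modes` lowest frequencies together with their
    mirrored counterparts, so a real input yields a real output.
    Every output depends on every input.
    """
    n = len(x)
    X = dft(x)
    filtered = [X[k] if min(k, n - k) < keep_modes else 0 for k in range(n)]
    return [v.real for v in idft(filtered)]


def local_global_block(x, kernel, keep_modes):
    """Run both branches in parallel on the same input and sum them."""
    loc = local_branch(x, kernel)
    glob = global_branch(x, keep_modes)
    return [a + b for a, b in zip(loc, glob)]


if __name__ == "__main__":
    signal = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
    print(local_global_block(signal, [0.25, 0.5, 0.25], keep_modes=2))
```

The local branch responds only near the impulse, while the truncated Fourier branch spreads information across the whole sequence, which is the complementary behaviour the abstract attributes to combining the two.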