Large foundation models (FMs) are transforming Earth science by integrating heterogeneous multimodal data, such as multi-platform imagery, gridded reanalysis data, diverse geophysical and geochemical observations, and domain-specific text, to support tasks ranging from basic perception to advanced scientific discovery. This paper provides a unified review of Earth science foundation models (Earth FMs) along two complementary dimensions: depth, which traces the evolution of model capabilities from perception to multimodal reasoning and agentic scientific workflows, and breadth, which summarizes their expanding applications across the atmosphere, hydrosphere, lithosphere, biosphere, anthroposphere, and cryosphere, as well as coupled Earth system processes. Using this framework, we review representative multimodal Earth FMs and compile more than 200 datasets and benchmarks spanning diverse Earth science tasks and modalities. We further discuss key challenges in multimodal data heterogeneity, scientific reliability and continual updating, scalability and sustainability, and the transition from foundation models to agentic and embodied Earth intelligence, and we outline future directions toward more integrated, trustworthy, and actionable AI Earth scientists. Overall, this paper offers a structured roadmap for understanding the development of Earth FMs in terms of both capability depth and application breadth.