ClimateGPT: Towards AI Synthesizing Interdisciplinary Research on Climate Change

David Thulke, Yingbo Gao, Petrus Pelser, Rein Brune, Rricha Jalota, Floris Fok, Michael Ramos, Ian van Wyk, Abdallah Nasir, Hayden Goldstein, Taylor Tragemann, Katie Nguyen, Ariana Fowler, Andrew Stanco, Jon Gabriel, Jordan Taylor, Dean Moro, Evgenii Tsymbalov, Juliette de Waal, Evgeny Matusov, Mudar Yaghi, Mohammad Shihadah, Hermann Ney, Christian Dugast, Jonathan Dotan, Daniel Erasmus

This paper introduces ClimateGPT, a model family of domain-specific large language models that synthesize interdisciplinary research on climate change. We trained two 7B models from scratch on a science-oriented dataset of 300B tokens. For the first model, the 4.2B domain-specific tokens were included during pre-training and the second was adapted to the climate domain after pre-training. Additionally, ClimateGPT-7B, 13B and 70B are continuously pre-trained from Llama 2 on a domain-specific dataset of 4.2B tokens. Each model is instruction fine-tuned on a high-quality and human-generated domain-specific dataset that has been created in close cooperation with climate scientists. To reduce the number of hallucinations, we optimize the model for retrieval augmentation and propose a hierarchical retrieval strategy. To increase the accessibility of our model to non-English speakers, we propose to make use of cascaded machine translation and show that this approach can perform comparably to natively multilingual models while being easier to scale to a large number of languages. Further, to address the intrinsic interdisciplinary aspect of climate change, we consider different research perspectives. Therefore, the model can produce in-depth answers focusing on different perspectives in addition to an overall answer. We propose a suite of automatic climate-specific benchmarks to evaluate LLMs. On these benchmarks, ClimateGPT-7B performs on par with the ten times larger Llama-2-70B Chat model while not degrading results on general domain benchmarks. Our human evaluation confirms the trends we saw in our benchmarks. All models were trained and evaluated using renewable energy and are released publicly.
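The cascaded machine translation idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `cascaded_answer`, its parameters, and the toy `toy_translate` and `toy_generate` stand-ins are all hypothetical names introduced here. The sketch assumes the usual meaning of a cascade: translate the user's question into English, query an English-only model, then translate the answer back.

```python
from typing import Callable

# Hypothetical type aliases: translate(text, source_lang, target_lang) and generate(prompt).
Translator = Callable[[str, str, str], str]
Generator = Callable[[str], str]


def cascaded_answer(
    question: str,
    user_lang: str,
    translate: Translator,
    generate: Generator,
    pivot_lang: str = "en",
) -> str:
    """Answer in the user's language using a model that only handles the pivot language."""
    if user_lang == pivot_lang:
        # No translation needed when the user already writes in the pivot language.
        return generate(question)
    pivot_question = translate(question, user_lang, pivot_lang)
    pivot_answer = generate(pivot_question)
    return translate(pivot_answer, pivot_lang, user_lang)


# Toy stand-ins (assumptions for illustration only) so the sketch runs end to end.
def toy_translate(text: str, src: str, tgt: str) -> str:
    return f"[{src}->{tgt}] {text}"


def toy_generate(prompt: str) -> str:
    return f"ANSWER({prompt})"


result = cascaded_answer(
    "Was ist der Treibhauseffekt?", "de", toy_translate, toy_generate
)
print(result)
```

Under this design, supporting an additional language only requires a translation system for that language pair; the language model itself is left unchanged, which is one plausible reading of why the abstract calls the approach easier to scale.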

