AcademicGPT: Empowering Academic Research

Shufa Wei, Xiaolong Xu, Xianbiao Qi, Xi Yin, Jun Xia, Jingyi Ren, Peijun Tang, Yuxiang Zhong, Yihao Chen, Xiaoqin Ren, Yuxin Liang, Liankai Huang, Kai Xie, Weikang Gui, Wei Tan, Shuanglong Sun, Yongquan Hu, Qinxian Liu, Nanjin Li, Chihao Dai, Lihua Wang, Xiaohui Liu, Lei Zhang, Yutao Xie

From arXiv. Technical Report. arXiv admin note: text overlap with arXiv:2310.12081, arXiv:2310.10053 by other authors.

Large Language Models (LLMs) have demonstrated exceptional capabilities across a wide range of natural language processing tasks. Yet many of these advanced LLMs are tailored for broad, general-purpose applications. In this technical report, we introduce AcademicGPT, designed specifically to empower academic research. AcademicGPT is a continually trained model derived from LLaMA2-70B. Our training corpus consists mainly of academic papers, theses, content from some academic domains, high-quality Chinese data, and other sources. Although the corpus is not extensive in scale, AcademicGPT marks our initial venture into a domain-specific GPT tailored to academic research. We evaluate AcademicGPT on established public benchmarks such as MMLU and CEval, as well as on specialized academic benchmarks such as PubMedQA, SCIEval, and our newly created ComputerScienceQA, to assess its general knowledge, Chinese-language, and academic abilities. Building on the AcademicGPT foundation model, we also developed several applications for the academic domain: General Academic Question Answering, AI-assisted Paper Reading, Paper Review, and AI-assisted Title and Abstract Generation.
