This paper introduces AnalyticsGPT, an intuitive and efficient large language model (LLM)-powered workflow for scientometric question answering. This underrepresented downstream task covers the subcategory of meta-scientific questions concerning the "science of science." Compared with traditional scientific question answering based on papers, the task poses unique challenges in the planning phase, namely the need for named-entity recognition of academic entities within questions and for multi-faceted data retrieval involving scientometric indices, e.g. impact factors. Beyond their exceptional capacity for traditional natural language processing tasks, LLMs have shown great potential in more complex applications, such as task decomposition, planning, and reasoning. In this paper, we explore the application of LLMs to scientometric question answering and describe an end-to-end system implementing a sequential workflow with retrieval-augmented generation and agentic concepts. We also address the secondary task of effectively synthesizing the retrieved data into presentable, well-structured high-level analyses. As the database for retrieval-augmented generation, we leverage a proprietary research performance assessment platform. For evaluation, we consult experienced subject matter experts and leverage LLMs-as-judges. In doing so, we provide valuable insights into the efficacy of LLMs on a niche downstream task. Our (skeleton) code and prompts are available at: https://github.com/lyvykhang/llm-agents-scientometric-qa/tree/acl.
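To make the sequential workflow concrete, the following is a minimal sketch of a plan, retrieve, synthesize pipeline. It is an illustration under stated assumptions, not the AnalyticsGPT implementation: the names `TOY_DB`, `Plan`, `plan_question`, `retrieve`, `synthesize`, and `answer` are hypothetical, the planning step uses plain substring matching where a real system would presumably prompt an LLM, and the in-memory dictionary with placeholder values stands in for the proprietary platform.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the proprietary assessment platform.
# Entity names and values are placeholders, not real metrics.
TOY_DB = {
    "journal a": {"impact_factor": 2.0},
    "journal b": {"impact_factor": 3.5},
}


@dataclass
class Plan:
    entities: list[str]  # academic entities recognized in the question
    metrics: list[str]   # scientometric indices the question asks about


def plan_question(question: str) -> Plan:
    """Planning step (assumption: substring matching instead of an LLM call)."""
    text = question.lower()
    entities = [name for name in TOY_DB if name in text]
    metrics = ["impact_factor"] if "impact factor" in text else []
    return Plan(entities, metrics)


def retrieve(plan: Plan) -> dict[str, dict[str, float]]:
    """Retrieval step: look up each requested metric for each entity."""
    return {e: {m: TOY_DB[e][m] for m in plan.metrics} for e in plan.entities}


def synthesize(question: str, data: dict[str, dict[str, float]]) -> str:
    """Synthesis step: format retrieved values as a short structured report."""
    lines = [f"Question: {question}"]
    for entity, values in data.items():
        for metric, value in values.items():
            lines.append(f"- {entity}: {metric} = {value}")
    return "\n".join(lines)


def answer(question: str) -> str:
    """Run the three steps in sequence."""
    return synthesize(question, retrieve(plan_question(question)))


if __name__ == "__main__":
    print(answer("Compare the impact factor of Journal A and Journal B"))
```

Keeping each stage as a separate function mirrors the sequential design described above and lets any stage be swapped out independently, for example replacing `plan_question` with an LLM-backed planner.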