Leveraging Large Language Models for Semantic Query Processing in a Scholarly Knowledge Graph

The proposed research aims to develop a semantic query processing system that enables users to obtain comprehensive information about research works produced by Computer Science (CS) researchers at the Australian National University (ANU). The system integrates Large Language Models (LLMs) with the ANU Scholarly Knowledge Graph (ASKG), a structured repository of all research-related artifacts produced at ANU in the CS field. Each artifact and its parts are represented as textual nodes stored in a Knowledge Graph (KG). Traditional methods for constructing and using scholarly KGs often fail to capture fine-grained details. To address this limitation, we propose a novel framework that combines the Deep Document Model (DDM), for comprehensive document representation, with KG-enhanced Query Processing (KGQP), for optimized handling of complex queries. DDM provides a fine-grained representation of the hierarchical structure and semantic relationships within academic papers, while KGQP leverages the KG structure to improve the accuracy and efficiency of LLM-based querying. Combining the ASKG with LLMs in this way enhances both knowledge utilization and natural language understanding. The proposed system employs an automatic LLM-SPARQL fusion to retrieve relevant facts and textual nodes from the ASKG. Initial experiments demonstrate that our framework is superior to baseline methods in terms of retrieval accuracy and query efficiency. We showcase the practical application of our framework in academic research scenarios, highlighting its potential to revolutionize scholarly knowledge management and discovery. This work enables researchers to acquire and use knowledge from documents more effectively, and it provides a foundation for developing precise and reliable interactions with LLMs.
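To make the idea of textual nodes in a hierarchical document model more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical three-level schema (paper, section, paragraph), hypothetical predicate names (`kind`, `text`, `hasPart`), and a plain in-memory pattern matcher standing in for a SPARQL engine. None of these names come from the ASKG itself.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]


@dataclass
class Node:
    """One textual node: a paper, a section, or a paragraph (hypothetical schema)."""
    node_id: str
    kind: str
    text: str
    children: List["Node"] = field(default_factory=list)


def to_triples(node: Node) -> Iterator[Triple]:
    """Flatten the document tree into (subject, predicate, object) triples."""
    yield (node.node_id, "kind", node.kind)
    yield (node.node_id, "text", node.text)
    for child in node.children:
        yield (node.node_id, "hasPart", child.node_id)
        yield from to_triples(child)


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return triples matching a pattern; None is a wildcard, like a SPARQL variable."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def paragraphs_mentioning(triples: List[Triple], keyword: str) -> List[str]:
    """Return ids of paragraph nodes whose text contains the keyword (case-insensitive)."""
    paragraph_ids = {s for s, _, _ in match(triples, p="kind", o="paragraph")}
    return sorted(s for s, _, text in match(triples, p="text")
                  if s in paragraph_ids and keyword.lower() in text.lower())


# A toy paper with two sections, each holding one paragraph (illustrative data only).
paper = Node("paper1", "paper", "An example paper", [
    Node("paper1/sec1", "section", "Introduction", [
        Node("paper1/sec1/p1", "paragraph", "Knowledge graphs store facts as triples."),
    ]),
    Node("paper1/sec2", "section", "Method", [
        Node("paper1/sec2/p1", "paragraph", "We query the graph with SPARQL."),
    ]),
])

triples = list(to_triples(paper))
hits = paragraphs_mentioning(triples, "sparql")
```

In a pipeline of the kind the abstract describes, the text of the retrieved nodes would presumably be passed to an LLM as context for answering the user's question. That step is omitted here because the abstract does not specify how it is done.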
