Leveraging Large Language Models for Semantic Query Processing in a Scholarly Knowledge Graph

The proposed research aims to develop a semantic query processing system that enables users to obtain comprehensive information about research works produced by Computer Science (CS) researchers at the Australian National University (ANU). The system integrates Large Language Models (LLMs) with the ANU Scholarly Knowledge Graph (ASKG), a structured repository of all research-related artifacts produced at ANU in the CS field. Each artifact and its parts are represented as textual nodes stored in a Knowledge Graph (KG). Traditional methods for constructing and using scholarly KGs often fail to capture fine-grained details. To address this limitation, we propose a novel framework that combines the Deep Document Model (DDM), for comprehensive document representation, with KG-enhanced Query Processing (KGQP), for optimized handling of complex queries. DDM provides a fine-grained representation of the hierarchical structure and semantic relationships within academic papers, while KGQP leverages the KG structure to improve the accuracy and efficiency of LLM-based querying. By combining the ASKG with LLMs, our approach enhances both knowledge utilization and natural language understanding. The proposed system employs an automatic LLM-SPARQL fusion to retrieve relevant facts and textual nodes from the ASKG. Initial experiments show that our framework outperforms baseline methods in both retrieval accuracy and query efficiency. We showcase the practical application of our framework in academic research scenarios, highlighting its potential to transform scholarly knowledge management and discovery. This work helps researchers acquire and use knowledge from documents more effectively and provides a foundation for precise and reliable interactions with LLMs.
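
To make the idea of textual nodes and SPARQL-based retrieval more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical `DocNode` class as a stand-in for a DDM-style hierarchy, an invented `ex:` namespace with `ex:hasPart` and `ex:text` predicates, and a keyword passed in directly where the real system would presumably rely on an LLM to interpret the user's question. The `retrieve` function is only an in-memory stand-in for running the generated query against a SPARQL endpoint.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]


@dataclass
class DocNode:
    """One textual node in a hypothetical DDM-style document hierarchy."""
    node_id: str
    kind: str   # e.g. "paper", "section", "paragraph"
    text: str
    children: List["DocNode"] = field(default_factory=list)


def to_triples(node: DocNode, parent_id: Optional[str] = None) -> List[Triple]:
    """Flatten the hierarchy into (subject, predicate, object) triples."""
    triples = [
        (node.node_id, "rdf:type", f"ex:{node.kind.capitalize()}"),
        (node.node_id, "ex:text", node.text),
    ]
    if parent_id is not None:
        triples.append((parent_id, "ex:hasPart", node.node_id))
    for child in node.children:
        triples.extend(to_triples(child, node.node_id))
    return triples


def build_sparql(keyword: str, kind: str = "Paragraph") -> str:
    """Build a SPARQL 1.1 query string for nodes whose text contains keyword.

    The prefix IRI and predicate names are assumptions for illustration.
    """
    safe = keyword.lower().replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX ex: <http://example.org/askg#>\n"
        "SELECT ?node ?text WHERE {\n"
        f"  ?node a ex:{kind} ;\n"
        "        ex:text ?text .\n"
        f'  FILTER(CONTAINS(LCASE(?text), "{safe}"))\n'
        "}"
    )


def retrieve(triples: List[Triple], keyword: str,
             kind: str = "Paragraph") -> List[Tuple[str, str]]:
    """In-memory stand-in for executing the query built by build_sparql."""
    typed = {s for s, p, o in triples if p == "rdf:type" and o == f"ex:{kind}"}
    return [
        (s, o) for s, p, o in triples
        if p == "ex:text" and s in typed and keyword.lower() in o.lower()
    ]


# A toy paper with one section and two paragraphs (invented content).
paper = DocNode("ex:paper1", "paper", "A paper on knowledge graphs", [
    DocNode("ex:paper1_sec1", "section", "Introduction", [
        DocNode("ex:paper1_sec1_p1", "paragraph",
                "Knowledge graphs store facts as triples."),
        DocNode("ex:paper1_sec1_p2", "paragraph",
                "Language models generate fluent text."),
    ]),
])

triples = to_triples(paper)
query = build_sparql("triples")
hits = retrieve(triples, "triples")
```

In this sketch, `hits` contains only the first paragraph node, and `query` holds the equivalent SPARQL text that could be sent to a real endpoint. Keeping paragraphs as separate typed nodes is what allows a query to return a specific passage rather than a whole paper.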
