Recent advancements in Large Language Models (LLMs) have showcased their proficiency in answering natural language queries. However, their effectiveness is limited by a lack of domain-specific knowledge, which raises concerns about the reliability of their responses. We introduce a hybrid system that augments LLMs with domain-specific knowledge graphs (KGs), aiming to improve factual correctness through KG-based retrieval. We demonstrate our methodology on a medical KG. It comprises five steps: (1) pre-processing, (2) Cypher query generation, (3) Cypher query processing, (4) KG retrieval, and (5) LLM-enhanced response generation. We evaluate our system on a curated dataset of 69 samples, achieving a precision of 78\% in retrieving correct KG nodes. Our findings indicate that the hybrid system surpasses a standalone LLM in accuracy and completeness, as judged by an LLM-as-a-Judge evaluation method. This positions the system as a promising tool for applications that demand factual correctness and completeness, such as target identification -- a critical process in pinpointing biological entities for disease treatment or crop enhancement. Moreover, its intuitive search interface and its ability to provide accurate responses within seconds make it well-suited for time-sensitive, precision-focused research contexts. We publish the source code together with the dataset and the prompt templates used.
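
The five steps listed above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the published implementation: every function name, the prompt wording, the whitespace-only pre-processing, and the keyword-based read-only check are hypothetical. The LLM and the graph database are passed in as plain callables so that the sketch runs without any external service.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical interfaces; none of these names come from the paper.
LLM = Callable[[str], str]                 # prompt -> completion
GraphRunner = Callable[[str], list[dict]]  # Cypher query -> result rows


@dataclass
class Answer:
    cypher: str
    rows: list[dict]
    text: str


def preprocess(question: str) -> str:
    """Step 1: collapse whitespace (a stand-in for the real pre-processing)."""
    return " ".join(question.split())


def generate_cypher(question: str, llm: LLM) -> str:
    """Step 2: ask the LLM to translate the question into a Cypher query."""
    return llm(f"Translate into one Cypher query:\n{question}")


def process_cypher(raw: str) -> str:
    """Step 3: strip Markdown code fences and reject write operations.

    The keyword check is deliberately crude (assumption): it also rejects
    harmless queries that merely contain one of these words as a token.
    """
    query = re.sub(r"`{3}(?:cypher)?", "", raw).strip()
    if re.search(r"\b(CREATE|MERGE|DELETE|SET|REMOVE|DROP)\b", query, re.I):
        raise ValueError("only read-only queries are allowed")
    return query


def answer(question: str, llm: LLM, run_query: GraphRunner) -> Answer:
    """Run all five steps and return the query, the KG rows, and the reply."""
    clean = preprocess(question)                         # (1) pre-processing
    query = process_cypher(generate_cypher(clean, llm))  # (2) + (3)
    rows = run_query(query)                              # (4) KG retrieval
    prompt = (                                           # (5) grounded reply
        f"Question: {clean}\n"
        f"Facts from the knowledge graph: {rows}\n"
        "Answer using only these facts."
    )
    return Answer(cypher=query, rows=rows, text=llm(prompt))


if __name__ == "__main__":
    # Stub LLM and stub graph with made-up data, for demonstration only.
    def stub_llm(prompt: str) -> str:
        if prompt.startswith("Translate"):
            return "MATCH (d:Drug) RETURN d.name AS drug"
        return "Stub reply based on: " + prompt.splitlines()[1]

    result = answer("Which drugs are listed?", stub_llm,
                    lambda q: [{"drug": "example-drug"}])
    print(result.cypher)
    print(result.text)
```

In a real deployment, `run_query` would wrap a graph database session and `llm` would wrap a model API call. Keeping both behind plain callables makes each step testable in isolation.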