Way to Specialist: Closing Loop Between Specialized LLM and Evolving Domain Knowledge Graph

Large language models (LLMs) have demonstrated exceptional performance across a wide variety of domains. Nonetheless, generalist LLMs still fall short in reasoning tasks that require specialized knowledge. Prior work on specialized LLMs has focused on domain-specific training, which demands substantial effort in domain data acquisition and model parameter fine-tuning. To address these challenges, this paper proposes the Way-to-Specialist (WTS) framework, which combines retrieval-augmented generation with knowledge graphs (KGs) to enhance the specialized capability of LLMs without specialized training. Unlike existing paradigms, which merely use external knowledge from general KGs or static domain KGs to prompt the LLM for better domain-specific reasoning, WTS proposes an "LLM$\circlearrowright$KG" paradigm that achieves bidirectional enhancement between the specialized LLM and the domain knowledge graph (DKG). The paradigm comprises two closely coupled components: the DKG-Augmented LLM and the LLM-Assisted DKG Evolution. The former retrieves question-relevant domain knowledge from the DKG and uses it to prompt the LLM, enhancing its reasoning capability on domain-specific tasks; the latter leverages the LLM to generate new domain knowledge from processed tasks and uses it to evolve the DKG. WTS closes the loop between these two components, enabling continuous improvement in domain specialization as it progressively answers and learns from domain-specific questions. We validate WTS on 6 datasets spanning 5 domains. The experimental results show that WTS surpasses the previous state of the art (SOTA) in 4 specialized domains, with a maximum performance improvement of 11.3%.
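The closed loop described above (retrieve from the DKG, prompt the LLM, extract new knowledge, evolve the DKG) can be illustrated with a minimal sketch. This is not the WTS implementation: the `DomainKG` class, the substring-based retrieval, and the `answer_fn` and `extract_fn` callables (stand-ins for LLM calls) are all hypothetical names and simplifications chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


@dataclass
class DomainKG:
    """Toy in-memory DKG stored as a set of triples (hypothetical structure)."""
    triples: Set[Triple] = field(default_factory=set)

    def retrieve(self, question: str, k: int = 5) -> List[Triple]:
        # Assumption: naive retrieval that scores a triple by how many of its
        # entities (head, tail) occur as substrings of the question.
        q = question.lower()
        scored = []
        for t in self.triples:
            score = sum(1 for entity in (t[0], t[2]) if entity.lower() in q)
            if score > 0:
                scored.append((score, t))
        # Highest score first; ties broken alphabetically for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [t for _, t in scored[:k]]

    def evolve(self, new_triples: List[Triple]) -> int:
        # Merge new triples into the graph; return how many were actually new.
        before = len(self.triples)
        self.triples.update(new_triples)
        return len(self.triples) - before


def answer_and_learn(
    question: str,
    kg: DomainKG,
    answer_fn: Callable[[str], str],
    extract_fn: Callable[[str, str], List[Triple]],
) -> Tuple[str, int]:
    """One pass of the loop: DKG-augmented answering, then DKG evolution."""
    # Step 1 (DKG-Augmented LLM): retrieve facts and build the prompt.
    facts = kg.retrieve(question)
    context = "\n".join(f"{h} {r} {t}" for h, r, t in facts)
    prompt = f"Known facts:\n{context}\n\nQuestion: {question}"
    answer = answer_fn(prompt)
    # Step 2 (LLM-Assisted DKG Evolution): extract new triples and merge them.
    added = kg.evolve(extract_fn(question, answer))
    return answer, added


if __name__ == "__main__":
    # Stub "LLM" functions so the sketch runs without any model.
    kg = DomainKG({("aspirin", "treats", "headache")})
    answer, added = answer_and_learn(
        "What does aspirin inhibit?",
        kg,
        answer_fn=lambda prompt: "Aspirin inhibits COX-1.",
        extract_fn=lambda q, a: [("aspirin", "inhibits", "COX-1")],
    )
    print(answer, added, sorted(kg.triples))
```

In this sketch, each answered question can enlarge the graph, so later questions retrieve more context. That is the sense in which the loop is "closed"; a real system would replace the stubs with actual LLM calls and a stronger retriever.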
