Knowledge Base Question Answering (KBQA) aims to answer natural language questions over large-scale knowledge bases (KBs), and can be summarized into two crucial steps: knowledge retrieval and semantic parsing. However, three core challenges remain: inefficient knowledge retrieval, retrieval errors adversely affecting semantic parsing, and the complexity of previous KBQA methods. To tackle these challenges, we introduce ChatKBQA, a novel and simple generate-then-retrieve KBQA framework that first generates the logical form with fine-tuned LLMs, then retrieves and replaces entities and relations with an unsupervised retrieval method, improving both generation and retrieval more directly. Experimental results show that ChatKBQA achieves new state-of-the-art performance on the standard KBQA datasets WebQSP and CWQ. This work can also be regarded as a new paradigm for combining LLMs with knowledge graphs (KGs) for interpretable and knowledge-intensive question answering. Our code is publicly available.
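The generate-then-retrieve idea described above can be illustrated with a minimal sketch. All names below (the toy KB, the stand-in generator, the string-similarity retriever) are hypothetical simplifications, not ChatKBQA's actual implementation, which uses fine-tuned LLMs for generation and an unsupervised semantic retriever for grounding.

```python
# Minimal generate-then-retrieve sketch (hypothetical names and toy data).
import difflib

KB_ENTITIES = ["Barack Obama", "Michelle Obama", "Honolulu"]
KB_RELATIONS = ["people.person.place_of_birth", "people.person.spouse"]

def generate_logical_form(question):
    # Stand-in for a fine-tuned LLM: emits a draft logical form whose
    # entity/relation mentions may not exactly match KB labels.
    return ("(JOIN (R people.person.birthplace) Barak Obama)",
            ["Barak Obama"], ["people.person.birthplace"])

def retrieve(mention, candidates):
    # Unsupervised retrieval: pick the KB label most similar to the mention.
    return max(candidates, key=lambda c: difflib.SequenceMatcher(
        None, mention.lower(), c.lower()).ratio())

def ground(question):
    # Step 1: generate a draft logical form; step 2: retrieve and replace
    # each mentioned entity and relation with its closest KB counterpart.
    lf, ents, rels = generate_logical_form(question)
    for m in ents:
        lf = lf.replace(m, retrieve(m, KB_ENTITIES))
    for m in rels:
        lf = lf.replace(m, retrieve(m, KB_RELATIONS))
    return lf

print(ground("Where was Barack Obama born?"))
# → (JOIN (R people.person.place_of_birth) Barack Obama)
```

Because the skeleton of the logical form is fixed before grounding, retrieval errors can only affect individual entity or relation slots rather than the parse structure itself, which is the motivation for generating first and retrieving second.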