The "pre-train, prompt, predict" paradigm of large language models (LLMs) has achieved remarkable success in open-domain question answering (OD-QA). However, few works explore this paradigm for multi-document question answering (MD-QA), a task that demands a thorough understanding of the logical associations among the contents and structures of different documents. To fill this gap, we propose Knowledge Graph Prompting (KGP), a method for formulating the right context when prompting LLMs for MD-QA. KGP consists of a graph construction module and a graph traversal module. For graph construction, we create a knowledge graph (KG) over multiple documents, with nodes representing passages or document structures (e.g., pages/tables) and edges denoting semantic/lexical similarity between passages or intra-document structural relations. For graph traversal, we design an LLM-based graph traversal agent that navigates across nodes and gathers supporting passages to assist LLMs in MD-QA. The constructed graph serves as a global ruler that regulates the transitional space among passages and reduces retrieval latency. Concurrently, the graph traversal agent acts as a local navigator that gathers pertinent context to progressively approach the question and guarantee retrieval quality. Extensive experiments underscore the efficacy of KGP for MD-QA, signifying the potential of leveraging graphs to enhance prompt design for LLMs. Our code is available at https://github.com/YuWVandy/KG-LLM-MDQA.
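To make the two modules concrete, the following is a minimal sketch of the idea and not the released implementation. It assumes passages are plain strings, uses Jaccard overlap of word sets as a stand-in for the semantic/lexical similarity that defines edges, and replaces the LLM-based traversal agent with a simple word-overlap scorer. The names `STOPWORDS`, `tokens`, `jaccard`, `build_graph`, and `traverse`, and the parameters `threshold`, `seeds`, `branch`, and `budget`, are all hypothetical and chosen only for illustration.

```python
import re
from collections import deque

# Assumption: a tiny stopword list so that function words do not create edges.
STOPWORDS = {"a", "an", "the", "is", "in", "of", "was", "and", "to"}


def tokens(text):
    """Lowercased content-word set; a crude stand-in for a real encoder."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_graph(passages, threshold=0.1):
    """Graph construction: link passages whose lexical similarity
    reaches the threshold. Nodes are passage indices."""
    toks = [tokens(p) for p in passages]
    graph = {i: set() for i in range(len(passages))}
    for i in range(len(passages)):
        for j in range(i + 1, len(passages)):
            if jaccard(toks[i], toks[j]) >= threshold:
                graph[i].add(j)
                graph[j].add(i)
    return graph


def traverse(passages, graph, question, seeds=1, branch=1, budget=3):
    """Graph traversal: start from the passages most similar to the
    question, then expand only along graph edges, breadth first.
    score() is a placeholder for the LLM-based agent."""
    q = tokens(question)

    def score(i):
        return jaccard(q, tokens(passages[i]))

    ranked = sorted(graph, key=score, reverse=True)
    queue = deque(ranked[:seeds])
    visited = []
    while queue and len(visited) < budget:
        node = queue.popleft()
        if node in visited:
            continue
        visited.append(node)
        # The graph restricts candidates to neighbours of the current node.
        neighbours = sorted(
            (n for n in graph[node] if n not in visited),
            key=score,
            reverse=True,
        )
        queue.extend(neighbours[:branch])
    return [passages[i] for i in visited]
```

In this sketch the graph limits the candidate set at each step to the neighbours of the current passage, while the scorer decides which neighbour to follow. A passage that shares no words with the question can therefore still be gathered if it is linked to a passage that does, which is the multi-hop behaviour that MD-QA requires.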