The 'pre-train, prompt, predict' paradigm of large language models (LLMs) has achieved remarkable success in open-domain question answering (OD-QA). However, few works explore this paradigm in the scenario of multi-document question answering (MD-QA), a task demanding a thorough understanding of the logical associations among the contents and structures of different documents. To fill this crucial gap, we propose a Knowledge Graph Prompting (KGP) method to formulate the right context in prompting LLMs for MD-QA, which consists of a graph construction module and a graph traversal module. For graph construction, we create a knowledge graph (KG) over multiple documents with nodes symbolizing passages or document structures (e.g., pages/tables), and edges denoting the semantic/lexical similarity between passages or intra-document structural relations. For graph traversal, we design an LM-guided graph traverser that navigates across nodes and gathers supporting passages assisting LLMs in MD-QA. The constructed graph serves as the global ruler that regulates the transitional space among passages and reduces retrieval latency. Concurrently, the LM-guided traverser acts as a local navigator that gathers pertinent context to progressively approach the question and guarantee retrieval quality. Extensive experiments underscore the efficacy of KGP for MD-QA, signifying the potential of leveraging graphs in enhancing the prompt design for LLMs. Our code is at https://github.com/YuWVandy/KG-LLM-MDQA.
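The two modules described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes Jaccard token overlap as a stand-in for the semantic/lexical similarity that defines edges, and a pluggable `score` function as a stand-in for the LM-guided traverser. All names here (`build_graph`, `traverse`, `threshold`, `budget`) are hypothetical and chosen for illustration only.

```python
import re
from itertools import combinations


def tokens(text):
    """Lowercased word set used for a crude lexical-similarity measure."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard overlap between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_graph(passages, threshold=0.05):
    """Graph construction: one node per passage, with an edge between two
    passages whose lexical overlap meets the (assumed) threshold."""
    toks = [tokens(p) for p in passages]
    graph = {i: set() for i in range(len(passages))}
    for i, j in combinations(range(len(passages)), 2):
        if jaccard(toks[i], toks[j]) >= threshold:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def traverse(passages, graph, question, score=None, budget=3):
    """Graph traversal: seed with the best-scoring passage, then repeatedly
    visit the best-scoring unvisited neighbor of the nodes gathered so far.

    `score(context, passage)` is a placeholder for the LM-guided traverser;
    the default falls back to Jaccard overlap (an assumption of this sketch).
    """
    if score is None:
        score = lambda ctx, p: jaccard(tokens(ctx), tokens(p))
    if not passages:
        return []
    seed = max(range(len(passages)), key=lambda i: score(question, passages[i]))
    visited = [seed]
    while len(visited) < budget:
        # Only neighbors of already-visited nodes are candidates, so the
        # graph restricts the search space at every step.
        frontier = {n for v in visited for n in graph[v]} - set(visited)
        if not frontier:
            break
        # The context grows with each gathered passage, so later hops can
        # follow evidence that the question alone does not mention.
        context = question + " " + " ".join(passages[v] for v in visited)
        best = max(sorted(frontier), key=lambda i: score(context, passages[i]))
        visited.append(best)
    return [passages[i] for i in visited]
```

In this sketch the graph limits which passages may be visited next, while the scoring function decides which of those candidates to follow. That mirrors the division of labor the abstract describes between the constructed graph and the traverser. Replacing `score` with a call to a language model would be the natural next step, though the interface above is only one plausible design.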