The "pre-train, prompt, predict" paradigm of large language models (LLMs) has achieved remarkable success in open-domain question answering (OD-QA). However, few studies explore this paradigm for multi-document question answering (MD-QA), a task that demands a thorough understanding of the logical associations among the contents and structures of different documents. To fill this gap, we propose Knowledge Graph Prompting (KGP), a method for formulating the right context when prompting LLMs for MD-QA. KGP consists of a graph construction module and a graph traversal module. For graph construction, we create a knowledge graph (KG) over multiple documents, with nodes representing passages or document structures (e.g., pages/tables) and edges denoting semantic/lexical similarity between passages or intra-document structural relations. For graph traversal, we design an LLM-based graph traversal agent that navigates across nodes and gathers supporting passages to assist LLMs in MD-QA. The constructed graph serves as a global ruler that regulates the transitional space among passages and reduces retrieval latency. Concurrently, the graph traversal agent acts as a local navigator that gathers pertinent context to progressively approach the question and guarantee retrieval quality. Extensive experiments underscore the efficacy of KGP for MD-QA, signifying the potential of leveraging graphs to enhance prompt design for LLMs. Our code is available at https://github.com/YuWVandy/KG-LLM-MDQA.
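The two modules described above can be illustrated with a minimal sketch. It rests on loudly simplified assumptions: Jaccard token overlap stands in for the semantic/lexical similarity used to draw edges, and a plain scoring function stands in for the LLM-based traversal agent. All function names and parameters (`build_passage_graph`, `traverse`, `threshold`, `beam`, `max_hops`) are hypothetical and are not taken from the KG-LLM-MDQA repository.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard overlap between two token sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def lexical_score(query, passage):
    """Assumed stand-in for the relevance judgment an LLM agent would make."""
    return jaccard(tokenize(query), tokenize(passage))


def build_passage_graph(passages, threshold=0.1):
    """Graph construction: connect passages whose lexical overlap
    reaches `threshold`. Returns an adjacency dict {index: set(indices)}."""
    tokens = [tokenize(p) for p in passages]
    graph = {i: set() for i in range(len(passages))}
    for i in range(len(passages)):
        for j in range(i + 1, len(passages)):
            if jaccard(tokens[i], tokens[j]) >= threshold:
                graph[i].add(j)
                graph[j].add(i)
    return graph


def traverse(question, passages, graph, score_fn=lexical_score,
             n_seeds=1, beam=2, max_hops=2):
    """Graph traversal: start from the passages most similar to the
    question, then repeatedly move to the best-scoring unvisited
    neighbors. Only graph neighbors are considered at each hop, which
    is how the graph restricts the search space."""
    ranked = sorted(range(len(passages)),
                    key=lambda i: score_fn(question, passages[i]),
                    reverse=True)
    frontier = ranked[:n_seeds]
    visited = list(frontier)
    for _ in range(max_hops):
        best = {}  # neighbor index -> best score seen from any frontier node
        for node in frontier:
            # Condition the score on the question plus the passage just read.
            context = question + " " + passages[node]
            for nb in graph[node]:
                if nb in visited:
                    continue
                score = score_fn(context, passages[nb])
                if score > best.get(nb, -1.0):
                    best[nb] = score
        # Highest score first; ties broken by passage index for determinism.
        frontier = sorted(best, key=lambda nb: (-best[nb], nb))[:beam]
        if not frontier:
            break
        visited.extend(frontier)
    return [passages[i] for i in visited]
```

In this sketch, the gathered passages would be concatenated into the prompt context for the answering LLM. Swapping `score_fn` for a call to an actual model is the natural extension point, though how the original agent scores candidates is not specified in the abstract.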