Large language models (LLMs), while exhibiting exceptional performance, suffer from hallucinations, especially on knowledge-intensive tasks. Existing works propose to augment LLMs with individual text units retrieved from external knowledge corpora to alleviate the issue. However, in many domains, texts are interconnected (e.g., academic papers in a bibliographic graph are linked by citations and co-authorships), forming a (text-attributed) graph. The knowledge in such graphs is encoded not only in individual texts/nodes but also in the connections between them. To facilitate research on augmenting LLMs with graphs, we manually construct a Graph Reasoning Benchmark dataset called GRBench, containing 1,740 questions that can be answered with the knowledge from 10 domain graphs. We then propose a simple and effective framework called Graph Chain-of-Thought (Graph-CoT) that augments LLMs with graphs by encouraging LLMs to reason on the graph iteratively. Each Graph-CoT iteration consists of three sub-steps: LLM reasoning, LLM-graph interaction, and graph execution. We conduct systematic experiments with three LLM backbones on GRBench, where Graph-CoT consistently outperforms the baselines. The code is available at https://github.com/PeterGriffinJin/Graph-CoT.
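The three sub-steps of a Graph-CoT iteration can be sketched as a loop: the LLM reasons about what it still needs, requests a graph function call (LLM-graph interaction), and the environment executes that call on the graph before the next round. The sketch below is illustrative only; the function names (`llm_reason`, `NeighborCheck`), the stub reasoning rule, and the toy adjacency-list graph are hypothetical stand-ins, not the paper's actual interface.

```python
# Minimal sketch of a Graph-CoT-style iterative loop (hypothetical names/API,
# not the paper's implementation).

def llm_reason(question, history):
    # Sub-step 1, LLM reasoning: decide the next action. A real system would
    # prompt an LLM; here a trivial rule stands in: first look up the node's
    # neighbors, then answer with what was found.
    if not history:
        return ("interact", "NeighborCheck", "paper_1")
    return ("finish", history[-1])

def graph_execute(graph, func, node_id):
    # Sub-step 3, graph execution: run the requested graph function.
    if func == "NeighborCheck":
        return graph["edges"].get(node_id, [])
    raise ValueError(f"unknown graph function: {func}")

def graph_cot(question, graph, max_iters=5):
    history = []
    for _ in range(max_iters):
        step = llm_reason(question, history)
        if step[0] == "finish":
            return step[1]
        # Sub-step 2, LLM-graph interaction: parse the requested call.
        _, func, node_id = step
        history.append(graph_execute(graph, func, node_id))
    return history[-1] if history else None

# Toy citation graph: paper_1 cites paper_2 and paper_3.
toy_graph = {"edges": {"paper_1": ["paper_2", "paper_3"]}}
answer = graph_cot("Which papers does paper_1 cite?", toy_graph)
```

On the toy graph, the loop takes one interaction round (fetching the neighbors of `paper_1`) and finishes on the second round, returning the retrieved neighbor list.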