RAG vs. GraphRAG: A Systematic Evaluation and Key Insights

Retrieval-Augmented Generation (RAG) improves the performance of large language models (LLMs) across a range of tasks by retrieving relevant information from external sources, and is particularly effective on text-based data. For structured data such as knowledge graphs, GraphRAG has been widely used to retrieve relevant information. Recent studies, however, show that structuring the implicit knowledge in text as a graph can benefit certain tasks, which extends GraphRAG from graph data to general text-based data. Despite these successful extensions, most applications of GraphRAG to text data have been designed for specific tasks and datasets, and a systematic evaluation and comparison of RAG and GraphRAG on widely used text-based benchmarks is still lacking. In this paper, we systematically evaluate RAG and GraphRAG on well-established benchmark tasks, such as Question Answering and Query-based Summarization. Our results highlight the distinct strengths of RAG and GraphRAG across different tasks and evaluation perspectives. Motivated by these observations, we investigate strategies that combine their strengths to improve downstream tasks. We also provide an in-depth discussion of the shortcomings of current GraphRAG approaches and outline directions for future research.
