Retrieval-Augmented Generation (RAG) systems enhance large language models (LLMs) by integrating external knowledge sources, enabling more accurate and contextually relevant responses tailored to user needs. However, existing RAG systems have significant limitations, including reliance on flat data representations and inadequate contextual awareness, which can lead to fragmented answers that fail to capture complex interdependencies. To address these challenges, we propose LightRAG, which incorporates graph structures into the text indexing and retrieval processes. The framework employs a dual-level retrieval system that supports comprehensive information retrieval through both low-level and high-level knowledge discovery. Additionally, integrating graph structures with vector representations enables efficient retrieval of related entities and their relationships, significantly improving response times while maintaining contextual relevance. An incremental update algorithm further ensures the timely integration of new data, allowing the system to remain effective and responsive in rapidly changing data environments. Extensive experimental validation demonstrates considerable improvements in retrieval accuracy and efficiency compared to existing approaches. LightRAG is open-source and available at https://github.com/HKUDS/LightRAG.
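As an illustration only, the following Python sketch shows one way the ideas in the abstract (a graph index, dual-level retrieval, and incremental updates) could fit together. It is not LightRAG's implementation or API: the class `GraphIndex`, its methods, and the sample data are hypothetical, and exact string matching stands in for the vector-based matching the abstract mentions.

```python
from dataclasses import dataclass, field


@dataclass
class GraphIndex:
    """Toy graph index (hypothetical; not the LightRAG API)."""

    # entity name -> short description
    entities: dict = field(default_factory=dict)
    # (source, target) -> set of theme keywords attached to the relation
    relations: dict = field(default_factory=dict)

    def insert(self, entities, relations):
        """Incremental update: merge new nodes and edges into the
        existing graph instead of rebuilding the index."""
        for name, description in entities.items():
            self.entities.setdefault(name, description)
        for edge, themes in relations.items():
            self.relations.setdefault(edge, set()).update(themes)

    def retrieve_low(self, keywords):
        """Low-level retrieval: match specific entities by name,
        then add their one-hop neighbors."""
        hits = {k for k in keywords if k in self.entities}
        neighbors = set()
        for src, dst in self.relations:
            if src in hits:
                neighbors.add(dst)
            if dst in hits:
                neighbors.add(src)
        return hits | neighbors

    def retrieve_high(self, themes):
        """High-level retrieval: match relations by theme keyword."""
        wanted = set(themes)
        return {edge for edge, t in self.relations.items() if t & wanted}


index = GraphIndex()
index.insert(
    {"bee": "pollinating insect", "flower": "flowering plant"},
    {("bee", "flower"): {"pollination"}},
)
# A later batch is merged into the same graph.
index.insert(
    {"beekeeper": "person who keeps bees"},
    {("beekeeper", "bee"): {"agriculture"}},
)

print(sorted(index.retrieve_low(["beekeeper"])))  # ['bee', 'beekeeper']
print(index.retrieve_high(["pollination"]))
```

In this sketch the second `insert` call only adds to the existing dictionaries, which is the property the incremental update is meant to convey; a real system would also need to embed the new entities and relations.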