Empowering Large Language Models to Set up a Knowledge Retrieval Indexer via Self-Learning

Retrieval-Augmented Generation (RAG) offers a cost-effective approach to injecting real-time knowledge into large language models (LLMs). Nevertheless, constructing and validating high-quality knowledge repositories requires considerable effort. We propose a pre-retrieval framework named Pseudo-Graph Retrieval-Augmented Generation (PG-RAG), which treats LLMs as students: it provides them with abundant raw reading materials and encourages them to read autonomously and record factual information in their own words. The resulting concise, well-organized mental indices are interconnected through common topics or complementary facts to form a pseudo-graph database. During the retrieval phase, PG-RAG mimics the human behavior of flipping through notes, first identifying fact paths and then exploring the related contexts. Adhering to the principle that "the path taken by many is the best", it integrates highly corroborated fact paths into a structured and refined sub-graph to assist LLMs. We validated PG-RAG on three specialized question-answering datasets. In single-document tasks, PG-RAG significantly outperformed the current best baseline, KGP-LLaMA, across all key evaluation metrics, with an average overall performance improvement of 11.6%. Specifically, its BLEU score increased by approximately 14.3%, and its QE-F1 metric improved by 23.7%. In multi-document scenarios, the average metrics of PG-RAG were at least 2.35% higher than those of the best baseline. Notably, the BLEU score and QE-F1 metric showed stable improvements of around 7.55% and 12.75%, respectively. Our code is available at https://github.com/IAAR-Shanghai/PGRAG.
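
The retrieval idea described above (identify candidate fact paths, then keep those that are corroborated by other candidates and merge them into a sub-graph) can be illustrated with a minimal sketch. This is not the PG-RAG implementation: the fact-path tuples, the token-overlap scoring, and the names `FACT_PATHS`, `overlap`, `retrieve_subgraph`, `top_k`, and `min_support` are all assumptions made for illustration, and a real system would likely use embedding similarity rather than token overlap.

```python
from collections import Counter

# Hypothetical "mental index": each entry is a path from a topic
# down to a fact statement (illustrative data, not from the paper).
FACT_PATHS = [
    ("PG-RAG", "retrieval", "identifies fact paths"),
    ("PG-RAG", "retrieval", "explores related contexts"),
    ("PG-RAG", "indexing", "LLM records facts in its own words"),
    ("KGP", "retrieval", "traverses a knowledge graph"),
]


def tokens(text):
    """Lowercased whitespace tokens of a string."""
    return set(text.lower().split())


def overlap(query, path):
    """Toy relevance score: number of query tokens that occur in the path."""
    return len(tokens(query) & tokens(" ".join(path)))


def retrieve_subgraph(query, paths, top_k=3, min_support=2):
    # 1. Pick the top_k candidate paths with a non-zero score.
    ranked = sorted(paths, key=lambda p: overlap(query, p), reverse=True)
    candidates = [p for p in ranked[:top_k] if overlap(query, p) > 0]

    # 2. Count how many candidates pass through each path prefix.
    support = Counter(p[:i] for p in candidates for i in range(1, len(p)))

    # 3. Keep only paths whose parent prefix is shared by at least
    #    min_support candidates ("the path taken by many").
    kept = [p for p in candidates if support[p[:-1]] >= min_support]

    # 4. Merge the kept paths into a nested dict that acts as the sub-graph.
    subgraph = {}
    for p in kept:
        node = subgraph
        for key in p[:-1]:
            node = node.setdefault(key, {})
        node.setdefault("facts", []).append(p[-1])
    return subgraph


result = retrieve_subgraph("PG-RAG retrieval fact paths", FACT_PATHS)
print(result)
```

In this toy run, the two paths under `("PG-RAG", "retrieval")` corroborate each other and are merged, while the single path under `("PG-RAG", "indexing")` is dropped because no other candidate shares its prefix.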
