Retrieval-Augmented Generation (RAG) has introduced a new paradigm for Large Language Models (LLMs), helping them solve knowledge-intensive tasks. However, current RAG models treat LLMs as passive knowledge receptors, which restricts their capacity to learn from and comprehend external knowledge. In this paper, we present ActiveRAG, a RAG framework that shifts from passive knowledge acquisition to an active learning mechanism. ActiveRAG uses a Knowledge Construction mechanism to develop a deeper understanding of external knowledge by associating it with previously acquired or memorized knowledge. It then applies a Cognitive Nexus mechanism that incorporates the outcomes of both chain-of-thought reasoning and knowledge construction, thereby calibrating the intrinsic cognition of LLMs. Our experimental results show that ActiveRAG surpasses previous RAG models, achieving a 5% improvement on question-answering datasets. All data and code are available at https://github.com/OpenMatch/ActiveRAG.
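The control flow described in the abstract (retrieve, construct knowledge, reason step by step, then merge both in a Cognitive Nexus step) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the prompt wording, the `llm` and `retrieve` callables, and the parameter `k` are all hypothetical placeholders chosen for illustration; the actual prompts and agents are in the linked repository.

```python
from typing import Callable, List

# Assumption: an "LLM" is any function mapping a prompt string to a text reply,
# and a "retriever" maps (question, k) to a list of k passages.
LLM = Callable[[str], str]
Retriever = Callable[[str, int], List[str]]


def knowledge_construction(llm: LLM, question: str, passages: List[str]) -> str:
    """Ask the model to relate retrieved passages to what it already knows."""
    joined = "\n".join(f"- {p}" for p in passages)
    # Hypothetical prompt wording, for illustration only.
    prompt = (
        f"Question: {question}\n"
        f"Retrieved passages:\n{joined}\n"
        "Connect these passages to knowledge you already have and "
        "summarize what is relevant to the question."
    )
    return llm(prompt)


def chain_of_thought(llm: LLM, question: str) -> str:
    """Produce an initial step-by-step draft without retrieved evidence."""
    return llm(f"Question: {question}\nLet's think step by step.")


def cognitive_nexus(llm: LLM, question: str, draft: str, constructed: str) -> str:
    """Revise the draft reasoning using the constructed knowledge."""
    prompt = (
        f"Question: {question}\n"
        f"Draft reasoning:\n{draft}\n"
        f"Constructed knowledge:\n{constructed}\n"
        "Use the constructed knowledge to correct the draft and give a final answer."
    )
    return llm(prompt)


def active_rag(llm: LLM, retrieve: Retriever, question: str, k: int = 5) -> str:
    """Run the three stages in sequence and return the final answer text."""
    passages = retrieve(question, k)
    constructed = knowledge_construction(llm, question, passages)
    draft = chain_of_thought(llm, question)
    return cognitive_nexus(llm, question, draft, constructed)
```

Any text-in, text-out model client and any retriever with the assumed signatures can be passed to `active_rag`; keeping both as plain callables makes the sketch independent of a specific LLM API.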