We introduce the Extract-Refine-Retrieve-Read (ERRR) framework, a novel approach designed to bridge the pre-retrieval information gap in Retrieval-Augmented Generation (RAG) systems through query optimization tailored to the specific knowledge requirements of Large Language Models (LLMs). Unlike conventional query optimization techniques used in RAG, the ERRR framework first extracts parametric knowledge from the LLM and then applies a specialized query optimizer to refine the user's query in light of that knowledge. This process ensures that only the most pertinent information essential for generating accurate responses is retrieved. Moreover, to enhance flexibility and reduce computational costs, we propose a trainable scheme for our pipeline that uses a smaller, tunable model as the query optimizer, refined through knowledge distillation from a larger teacher model. Our evaluations on various question-answering (QA) datasets and with different retrieval systems show that ERRR consistently outperforms existing baselines, proving to be a versatile and cost-effective module for improving the utility and accuracy of RAG systems.
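The four stages named by the framework can be sketched as a simple pipeline. This is a minimal, hypothetical illustration of the Extract-Refine-Retrieve-Read control flow only: the function names, the stub LLM interface, and the toy term-overlap retriever are all assumptions for the sake of the sketch, not the authors' implementation (which uses real LLMs, a trained query optimizer, and production retrieval systems).

```python
# Hypothetical sketch of an Extract-Refine-Retrieve-Read (ERRR) pipeline.
# All names below are illustrative assumptions, not the paper's actual API.

def extract(llm, question):
    # Extract: elicit the LLM's parametric knowledge as a draft answer.
    return llm(f"Answer from memory: {question}")

def refine(optimizer, question, draft):
    # Refine: the query optimizer turns the question plus the parametric
    # draft into one or more retrieval queries.
    return optimizer(question, draft)

def retrieve(corpus, queries, k=2):
    # Retrieve: toy term-overlap scoring as a stand-in for a real
    # retriever (e.g., BM25 or a dense index).
    scored = []
    for doc in corpus:
        words = set(doc.lower().replace(".", "").split())
        score = max(len(words & set(q.lower().split())) for q in queries)
        scored.append((score, doc))
    return [d for s, d in sorted(scored, reverse=True)[:k] if s > 0]

def read(llm, question, docs):
    # Read: generate the final answer conditioned on retrieved evidence.
    context = " ".join(docs)
    return llm(f"Context: {context}\nQuestion: {question}")

def errr(llm, optimizer, corpus, question):
    draft = extract(llm, question)          # parametric knowledge
    queries = refine(optimizer, question, draft)
    docs = retrieve(corpus, queries)        # only pertinent evidence
    return read(llm, question, docs)
```

A quick smoke test with stub components shows the flow end to end; in the trainable variant described above, `optimizer` would be the smaller distilled model rather than a lambda:

```python
corpus = ["Paris is the capital of France.", "Berlin is the capital of Germany."]
llm = lambda prompt: prompt.splitlines()[-1]      # echo stub in place of a real LLM
optimizer = lambda q, d: [q, d]                   # trivial refinement stub
answer = errr(llm, optimizer, corpus, "capital of France")
```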