Active Retrieval Augmented Generation

Despite the remarkable ability of large language models (LMs) to comprehend and generate language, they have a tendency to hallucinate and create factually inaccurate output. Augmenting LMs by retrieving information from external knowledge resources is one promising solution. Most existing retrieval-augmented LMs employ a retrieve-and-generate setup that only retrieves information once based on the input. This is limiting, however, in more general scenarios involving generation of long texts, where continually gathering information throughout the generation process is essential. There have been some past efforts to retrieve information multiple times while generating outputs, which mostly retrieve documents at fixed intervals using the previous context as queries. In this work, we provide a generalized view of active retrieval augmented generation: methods that actively decide when and what to retrieve over the course of generation. We propose Forward-Looking Active REtrieval augmented generation (FLARE), a generic retrieval-augmented generation method that iteratively predicts the upcoming sentence to anticipate future content and, if that sentence contains low-confidence tokens, uses it as a query to retrieve relevant documents and regenerate the sentence. We test FLARE along with baselines comprehensively over four long-form knowledge-intensive generation tasks/datasets. FLARE achieves superior or competitive performance on all tasks, demonstrating the effectiveness of our method. Code and datasets are available at https://github.com/jzbjyb/FLARE.
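
The loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the released FLARE implementation: `generate_sentence` and `retrieve` are hypothetical callables standing in for an LM and a retriever, the confidence `threshold` of 0.5 is an arbitrary placeholder, and a sentence is assumed to be returned as a list of (token, probability) pairs.

```python
from typing import Callable, List, Tuple

# Assumed representation: one sentence as (token, probability) pairs.
Sentence = List[Tuple[str, float]]

def flare_generate(
    question: str,
    generate_sentence: Callable[[str, str, List[str]], Sentence],  # hypothetical LM wrapper
    retrieve: Callable[[str], List[str]],                          # hypothetical retriever
    threshold: float = 0.5,    # assumed cutoff for "low-confidence" tokens
    max_sentences: int = 10,   # assumed safety limit on output length
) -> str:
    """Generate an answer sentence by sentence, retrieving only when needed."""
    answer: List[str] = []
    for _ in range(max_sentences):
        context = " ".join(answer)
        # Step 1: predict the upcoming sentence without any retrieved documents.
        tentative = generate_sentence(question, context, [])
        if not tentative:
            break  # the generator signals that the answer is complete
        text = " ".join(token for token, _ in tentative)
        # Step 2: if any token is low-confidence, use the tentative sentence
        # as the query and regenerate the sentence with the retrieved documents.
        if any(prob < threshold for _, prob in tentative):
            docs = retrieve(text)
            regenerated = generate_sentence(question, context, docs)
            if regenerated:
                text = " ".join(token for token, _ in regenerated)
        answer.append(text)
    return " ".join(answer)
```

In this sketch, retrieval is triggered by the model's own uncertainty instead of a fixed interval, and the query is the predicted upcoming sentence, not the preceding context. Both choices follow the description above; everything else (names, threshold, stopping rule) is illustrative.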
