Large Language Models (LLMs) have driven significant advances in Natural Language Processing (NLP), yet they face challenges such as hallucination and the need for domain-specific knowledge. To mitigate these problems, recent methods integrate information retrieved from external resources with LLMs, substantially improving performance across NLP tasks. This survey addresses the absence of a comprehensive overview of Retrieval-Augmented Language Models (RALMs), covering both Retrieval-Augmented Generation (RAG) and Retrieval-Augmented Understanding (RAU), and provides an in-depth examination of their paradigm, evolution, taxonomy, and applications. It discusses the essential components of RALMs, namely Retrievers, Language Models, and Augmentations, and how their interactions lead to diverse model structures and applications. RALMs prove useful across a spectrum of tasks, from translation and dialogue systems to knowledge-intensive applications. The survey also covers several evaluation methods for RALMs, emphasizing robustness, accuracy, and relevance in their assessment. It acknowledges the limitations of RALMs, particularly in retrieval quality and computational efficiency, and offers directions for future research. In conclusion, this survey aims to offer structured insight into RALMs, their potential, and the avenues for their future development in NLP. The paper is supplemented with a GitHub repository containing the surveyed works and resources for further study: https://github.com/2471023025/RALM_Survey.