Retrieval-Augmented Language Modeling (RALM) methods, which condition a language model (LM) on relevant documents from a grounding corpus during generation, have been shown to significantly improve language modeling performance. In addition, they can mitigate the problem of factually inaccurate text generation and provide a natural source attribution mechanism. Existing RALM approaches focus on modifying the LM architecture in order to facilitate the incorporation of external information, significantly complicating deployment. This paper considers a simple alternative, which we dub In-Context RALM: leaving the LM architecture unchanged and prepending grounding documents to the input, without any further training of the LM. We show that In-Context RALM built on off-the-shelf general-purpose retrievers provides surprisingly large LM gains across model sizes and diverse corpora. We also demonstrate that the document retrieval and ranking mechanism can be specialized to the RALM setting to further boost performance. We conclude that In-Context RALM has considerable potential to increase the prevalence of LM grounding, particularly in settings where a pretrained LM must be used without modification or even via API access.
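The core idea of prepending grounding documents to the LM input can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the function names (`overlap_score`, `retrieve`, `build_in_context_input`) are hypothetical, and the word-overlap scorer is a toy stand-in for an off-the-shelf retriever, not the retriever used in the paper. The sketch only builds the augmented input string; passing it to an unmodified pretrained LM is left out.

```python
from collections import Counter


def tokenize(text):
    # Toy whitespace tokenizer (assumption: a real system would use the retriever's own tokenizer).
    return text.lower().split()


def overlap_score(query, document):
    # Count tokens shared between query and document; a toy stand-in for a real retriever score.
    query_counts = Counter(tokenize(query))
    doc_counts = Counter(tokenize(document))
    return sum(min(query_counts[t], doc_counts[t]) for t in query_counts)


def retrieve(query, corpus, k=1):
    # Return the k highest-scoring documents from the grounding corpus.
    ranked = sorted(corpus, key=lambda doc: overlap_score(query, doc), reverse=True)
    return ranked[:k]


def build_in_context_input(prefix, corpus, k=1):
    # Prepend the retrieved documents to the prefix; the LM itself is left unchanged.
    documents = retrieve(prefix, corpus, k)
    return "\n".join(documents + [prefix])


if __name__ == "__main__":
    corpus = ["The Eiffel Tower is in Paris.", "Bananas are yellow."]
    print(build_in_context_input("The Eiffel Tower is located in", corpus))
```

Because the retrieved text is simply concatenated in front of the prefix, the same construction applies whether the LM is run locally or reached only through an API, which is the deployment setting the abstract highlights.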