Retrieval-Augmented Language Modeling (RALM) methods, which condition a language model (LM) on relevant documents from a grounding corpus during generation, have been shown to significantly improve language modeling performance while also providing a natural source attribution mechanism. Existing RALM approaches focus on modifying the LM architecture to facilitate the incorporation of external information, which significantly complicates deployment. This paper proposes an under-explored alternative, which we dub In-Context RALM: leaving the LM architecture unchanged and prepending grounding documents to the input. We show that In-Context RALM built on off-the-shelf general-purpose retrievers provides surprisingly large LM gains across model sizes and diverse corpora. We also demonstrate that the document retrieval and ranking mechanism can be specialized to the RALM setting to further boost performance. We conclude that In-Context RALM has considerable potential to increase the prevalence of LM grounding, particularly in settings where a pretrained LM must be used without modification or even via API access. To that end, we make our code publicly available.
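The core mechanism described above, prepending retrieved documents to an unmodified LM's input, can be illustrated with a minimal sketch. This is not the released code: the function names (`overlap_score`, `retrieve`, `build_input`), the toy corpus, and the token-overlap scorer are all assumptions made for illustration, with the scorer standing in for a real off-the-shelf retriever.

```python
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase whitespace tokenization; a stand-in for a real tokenizer."""
    return text.lower().split()


def overlap_score(query: str, document: str) -> int:
    """Count query tokens that also occur in the document (with multiplicity).

    Hypothetical scorer used only for illustration; a real system would
    plug in an off-the-shelf retriever here.
    """
    query_counts = Counter(tokenize(query))
    doc_counts = Counter(tokenize(document))
    return sum(min(count, doc_counts[token]) for token, count in query_counts.items())


def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    """Return the k highest-scoring documents; ties keep corpus order."""
    ranked = sorted(corpus, key=lambda doc: overlap_score(query, doc), reverse=True)
    return ranked[:k]


def build_input(prefix: str, corpus: list[str], k: int = 1) -> str:
    """Prepend retrieved documents to the prefix; the LM itself is untouched."""
    documents = retrieve(prefix, corpus, k)
    return "\n\n".join(documents + [prefix])


# Toy grounding corpus (assumed data, for illustration only).
corpus = [
    "The Eiffel Tower is located in Paris and was completed in 1889.",
    "Mount Everest is the highest mountain above sea level.",
]
prefix = "The Eiffel Tower was completed in"

# The augmented string is what would be passed to any pretrained LM,
# including one reachable only through an API.
augmented_input = build_input(prefix, corpus, k=1)
```

Because only the input string changes, the same `augmented_input` can be sent to any pretrained LM without touching its weights or architecture, which is the deployment advantage the abstract emphasizes.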