In this paper, we investigate the in-context learning ability of retrieval-augmented encoder-decoder language models. We first conduct a comprehensive analysis of existing models and identify their limitations in in-context learning, primarily due to a mismatch between pretraining and inference, as well as a restricted context length. To address these issues, we propose RAVEN, a model that combines retrieval-augmented masked language modeling and prefix language modeling. We further introduce Fusion-in-Context Learning to enhance the few-shot performance by enabling the model to leverage more in-context examples without requiring additional training. Through extensive experiments, we demonstrate that our simple yet effective design significantly improves performance, achieving results comparable to the most advanced language models in certain scenarios, despite having substantially fewer parameters. Our work underscores the potential of retrieval-augmented encoder-decoder language models for in-context learning and encourages further research in this direction.