In this paper, we investigate the in-context learning ability of retrieval-augmented encoder-decoder language models. We first conduct a comprehensive analysis of the state-of-the-art ATLAS model and identify its limitations in in-context learning, primarily due to a mismatch between pretraining and testing, as well as a restricted context length. To address these issues, we propose RAVEN, a model that combines retrieval-augmented masked language modeling and prefix language modeling. We further introduce Fusion-in-Context Learning to enhance the few-shot performance by enabling the model to leverage more in-context examples without requiring additional training or model modifications. Through extensive experiments, we demonstrate that RAVEN significantly outperforms ATLAS and achieves results comparable to the most advanced language models in certain scenarios, despite having substantially fewer parameters. Our work underscores the potential of retrieval-augmented encoder-decoder language models for in-context learning and encourages further research in this direction.