Large language models can exploit in-context learning to access external knowledge beyond their training data through retrieval augmentation. While promising, the inner workings of this mechanism remain unclear. In this work, we shed light on the mechanism of in-context retrieval augmentation for question answering by viewing a prompt as a composition of informational components. We propose an attribution-based method to identify specialized attention heads, revealing in-context heads that comprehend instructions and retrieve relevant contextual information, and parametric heads that store relational knowledge about entities. To better understand their roles, we extract function vectors from these heads and modify their attention weights to show how they influence the answer generation process. Finally, we leverage the gained insights to trace the sources of knowledge used during inference, paving the way towards safer and more transparent language models.
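As a concrete illustration of the function-vector step, the sketch below averages one attention head's contribution to the residual stream over a handful of in-context prompts, following the common recipe from the function-vector literature. It is not the paper's implementation: the model (`gpt2`), the layer/head indices, and the prompts are all illustrative assumptions, and the per-head slicing relies on GPT-2's layout where head outputs are concatenated before the `c_proj` output projection.

```python
# Minimal sketch (assumptions, not the paper's code): extract a "function vector"
# for one attention head by averaging its residual-stream contribution at the
# final token over several in-context prompts.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"                  # assumption: any decoder-only LM with this layout
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

LAYER, HEAD = 9, 8              # hypothetical indices of a head of interest
head_dim = model.config.n_embd // model.config.n_head
lo, hi = HEAD * head_dim, (HEAD + 1) * head_dim

collected = []

def grab_head_output(module, args):
    # args[0]: concatenated per-head outputs before the output projection,
    # shape (batch, seq, n_embd). Slice out HEAD's sub-space at the last
    # token and project it with that head's block of the c_proj weight
    # (a transformers Conv1D, weight shape (in_features, out_features)).
    head_out = args[0][:, -1, lo:hi]            # (batch, head_dim)
    w_block = module.weight[lo:hi, :]           # (head_dim, n_embd)
    collected.append((head_out @ w_block).detach())

proj = model.transformer.h[LAYER].attn.c_proj
handle = proj.register_forward_pre_hook(grab_head_output)

prompts = [                                     # illustrative ICL-style prompts
    "France -> Paris\nJapan -> Tokyo\nItaly ->",
    "red -> rouge\ndog -> chien\ncat ->",
]
with torch.no_grad():
    for p in prompts:
        model(**tok(p, return_tensors="pt"))
handle.remove()

function_vector = torch.cat(collected).mean(dim=0)  # (n_embd,)
print(function_vector.shape)
```

The same hook point also supports the causal interventions mentioned above: instead of recording the head's output, one can scale or zero its slice (or add the extracted vector elsewhere) and observe how the generated answer changes.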