Retrieval-augmented generation (RAG) improves Large Language Models (LLMs) by incorporating external information into the response generation process. However, how context-faithful LLMs are, and what factors influence their context-faithfulness, remain largely unexplored. In this study, we investigate the impact of memory strength and evidence presentation on LLMs' receptiveness to external evidence. We introduce a method to quantify the memory strength of LLMs by measuring the divergence in their responses to different paraphrases of the same question, a factor overlooked in previous work. We also generate evidence in various styles to evaluate how presentation style affects receptiveness. Two datasets are used for evaluation: Natural Questions (NQ), which contains popular questions, and PopQA, which features long-tail questions. Our results show that for questions with high memory strength, LLMs are more likely to rely on internal memory, particularly larger LLMs such as GPT-4. Conversely, presenting paraphrased evidence significantly increases LLMs' receptiveness compared with simple repetition or the addition of details.
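As a rough illustration of the divergence-based measurement described above, the sketch below scores memory strength as pairwise agreement among an LLM's answers to paraphrases of the same question. This is not the authors' implementation: the function name `memory_strength` and the normalized string-match agreement criterion are assumptions chosen for a minimal, self-contained example.

```python
from itertools import combinations

def memory_strength(answers: list[str]) -> float:
    """Hypothetical memory-strength score: the fraction of paraphrase pairs
    whose normalized answers agree. High agreement across paraphrases is
    read as strong, consistent internal memory; high divergence as weak memory.
    """
    norm = [a.strip().lower() for a in answers]
    pairs = list(combinations(norm, 2))
    if not pairs:
        return 1.0
    agree = sum(a == b for a, b in pairs)
    return agree / len(pairs)

# Example: answers an LLM might give to three paraphrases of one question.
responses = ["Paris", "paris", "Lyon"]
print(memory_strength(responses))  # 1 of 3 pairs agree -> ~0.33
```

In practice one would replace the exact string match with a semantic-equivalence check and obtain the answers by querying the model with several paraphrases of each question; the scalar score can then be used to bucket questions into high- and low-memory-strength groups.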