Revisiting Non-Verbatim Memorization in Large Language Models: The Role of Entity Surface Forms

Understanding what kinds of factual knowledge large language models (LLMs) memorize is essential for evaluating their reliability and limitations. Entity-based QA is a common framework for analyzing non-verbatim memorization, but typical evaluations query each entity using a single canonical surface form, making it difficult to disentangle fact memorization from access through a particular name. We introduce RedirectQA, an entity-based QA dataset that uses Wikipedia redirect information to associate Wikidata factual triples with categorized surface forms for each entity, including alternative names, abbreviations, spelling variants, and common erroneous forms. Across 13 LLMs, we examine surface-conditioned factual memorization and find that prediction outcomes often change when only the entity surface form changes. This inconsistency is category-dependent: models are more robust to minor orthographic variations than to larger lexical variations such as aliases and abbreviations. Frequency analyses further suggest that both entity- and surface-level frequencies are associated with accuracy, and that entity frequency often contributes beyond surface frequency. Overall, factual memorization appears neither purely surface-specific nor fully surface-invariant, highlighting the importance of surface-form diversity in evaluating non-verbatim memorization.
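One way to make the surface-form inconsistency concrete is to count, per surface-form category, how often a model's correctness on a variant form differs from its correctness on the canonical form of the same entity. The following is a minimal sketch under stated assumptions, not the paper's evaluation code: the record layout, the `"canonical"` label, the category names, and the function name `flip_rate_by_category` are all hypothetical.

```python
from collections import defaultdict


def flip_rate_by_category(records):
    """Compute how often outcomes change when only the surface form changes.

    `records` is an iterable of (entity_id, category, correct) tuples, where
    `category` is "canonical" for the canonical surface form and any other
    string (e.g. "alias", "spelling") for a variant form. This layout is an
    assumption made for illustration.

    Returns {category: fraction of variant queries whose correctness differs
    from the canonical-form outcome for the same entity}.
    """
    records = list(records)

    # First pass: remember the outcome obtained with the canonical form.
    canonical = {}
    for entity_id, category, correct in records:
        if category == "canonical":
            canonical[entity_id] = correct

    # Second pass: compare each variant outcome against the canonical one.
    flips = defaultdict(int)
    totals = defaultdict(int)
    for entity_id, category, correct in records:
        if category == "canonical" or entity_id not in canonical:
            continue  # skip canonical rows and entities with no baseline
        totals[category] += 1
        if correct != canonical[entity_id]:
            flips[category] += 1

    return {category: flips[category] / totals[category] for category in totals}


if __name__ == "__main__":
    # Made-up toy records for illustration only, not results from RedirectQA.
    toy = [
        ("Q1", "canonical", True),
        ("Q1", "alias", False),
        ("Q1", "spelling", True),
        ("Q2", "canonical", True),
        ("Q2", "alias", True),
        ("Q2", "spelling", True),
    ]
    print(flip_rate_by_category(toy))
```

A higher flip rate for a category would indicate that outcomes are less stable under that kind of surface variation. The sketch treats both correct-to-incorrect and incorrect-to-correct changes as flips.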

