Understanding what kinds of factual knowledge large language models (LLMs) memorize is essential for evaluating their reliability and limitations. Entity-based QA is a common framework for analyzing non-verbatim memorization, but typical evaluations query each entity using a single canonical surface form, making it difficult to disentangle fact memorization from access through a particular name. We introduce RedirectQA, an entity-based QA dataset that uses Wikipedia redirect information to associate Wikidata factual triples with categorized surface forms for each entity, including alternative names, abbreviations, spelling variants, and common erroneous forms. Across 13 LLMs, we examine surface-conditioned factual memorization and find that prediction outcomes often change when only the entity surface form changes. This inconsistency is category-dependent: models are more robust to minor orthographic variations than to larger lexical variations such as aliases and abbreviations. Frequency analyses further suggest that both entity- and surface-level frequencies are associated with accuracy, and that entity frequency often contributes beyond surface frequency. Overall, factual memorization appears neither purely surface-specific nor fully surface-invariant, highlighting the importance of surface-form diversity in evaluating non-verbatim memorization.
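As a rough illustration of what "prediction outcomes change when only the entity surface form changes" could mean operationally, the sketch below computes, per surface-form category, the fraction of non-canonical queries whose correctness differs from the canonical-name query for the same entity. This is only a minimal sketch under assumptions: the record layout, the category labels, the function name `flip_rate_by_category`, and the toy data are all hypothetical and are not taken from RedirectQA or from the paper's actual metrics.

```python
from collections import defaultdict

# Toy, made-up records: (entity_id, surface_category, is_correct).
# "canonical" marks the entity's main title; the other labels are
# illustrative stand-ins for redirect categories (hypothetical).
records = [
    ("Q1", "canonical", True),
    ("Q1", "abbreviation", False),
    ("Q1", "spelling_variant", True),
    ("Q2", "canonical", True),
    ("Q2", "alias", False),
    ("Q2", "spelling_variant", True),
    ("Q3", "canonical", False),
    ("Q3", "alias", False),
]


def flip_rate_by_category(records):
    """Per category, share of queries whose outcome differs from canonical."""
    # Outcome of the canonical-name query for each entity.
    canonical = {e: ok for e, cat, ok in records if cat == "canonical"}
    flips = defaultdict(int)
    totals = defaultdict(int)
    for entity, category, ok in records:
        # Skip the canonical query itself and entities lacking a baseline.
        if category == "canonical" or entity not in canonical:
            continue
        totals[category] += 1
        if ok != canonical[entity]:
            flips[category] += 1
    return {cat: flips[cat] / totals[cat] for cat in totals}


rates = flip_rate_by_category(records)
for category, rate in sorted(rates.items()):
    print(f"{category}: {rate:.2f}")
```

A flip is counted in both directions (correct to incorrect and incorrect to correct), so the measure reflects inconsistency rather than accuracy loss; an accuracy-based comparison would need a separate tally.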