Ongoing breakthroughs in large language models (LLMs) are reshaping scholarly search and discovery interfaces. While these systems offer new possibilities for navigating scientific knowledge, they also raise concerns about fairness and representational bias rooted in the models' memorized training data. As LLMs are increasingly used to answer queries about researchers and research communities, their ability to accurately reconstruct scholarly coauthor lists becomes an important but underexamined issue. In this study, we investigate how memorization in LLMs affects the reconstruction of coauthor lists and whether this process reflects existing inequalities across academic disciplines and world regions. We evaluate three prominent models, DeepSeek R1, Llama 4 Scout, and Mixtral 8x7B, by comparing their generated coauthor lists against bibliographic reference data. Our analysis reveals a systematic advantage for highly cited researchers, indicating that LLM memorization disproportionately favors already visible scholars. However, this pattern is not uniform: certain disciplines, such as Clinical Medicine, and some regions, including parts of Africa, exhibit more balanced reconstruction outcomes. These findings highlight both the risks and limitations of relying on LLM-generated relational knowledge in scholarly discovery contexts and emphasize the need for careful auditing of memorization-driven biases in LLM-based systems.
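The evaluation described above compares model-generated coauthor lists against bibliographic reference data. As a minimal sketch of one plausible way to score such a comparison, the snippet below computes set-based precision, recall, and F1 after simple lowercase name normalization; the paper's actual matching procedure (e.g., handling of initials or transliterated names) may differ.

```python
def coauthor_overlap(generated, reference):
    """Score a model-generated coauthor list against a bibliographic
    reference list using set-based precision, recall, and F1.

    Names are normalized by stripping whitespace and lowercasing;
    this is an illustrative choice, not the paper's method.
    """
    gen = {name.strip().lower() for name in generated}
    ref = {name.strip().lower() for name in reference}
    true_positives = len(gen & ref)
    precision = true_positives / len(gen) if gen else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Illustrative usage with hypothetical names:
scores = coauthor_overlap(
    ["A. Smith", "B. Jones", "C. Wu"],
    ["a. smith", "B. Jones", "D. Lee", "E. Park"],
)
```

Here two of three generated names match the reference list, giving a precision of 2/3 and a recall of 1/2; aggregating such scores across researchers grouped by citation count, discipline, or region is one way the disparities reported in the abstract could be quantified.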