Graph data is crucial for many applications, and much of it exists as relations described in text. As a result, being able to accurately recall and encode a graph described in earlier text is a basic yet pivotal ability that LLMs need to demonstrate if they are to perform reasoning tasks that involve graph-structured information. Human performance at graph recall has been studied by cognitive scientists for decades, and has been found to often exhibit certain structural patterns of bias that align with human handling of social relationships. To date, however, we know little about how LLMs behave in analogous graph recall tasks: do their recalled graphs also exhibit certain biased patterns, and if so, how do they compare with humans and affect other graph reasoning tasks? In this work, we perform the first systematic study of graph recall by LLMs, investigating the accuracy and biased microstructures (local structural patterns) in their recall. We find that LLMs not only often underperform in graph recall, but also tend to favor more triangles and alternating 2-paths. Moreover, we find that more advanced LLMs have a striking dependence on the domain that a real-world graph comes from: they yield the best recall accuracy when the graph is narrated in a language style consistent with its original domain.
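To make the two quantities above concrete, the following is a minimal Python sketch of how edge-level recall accuracy and one microstructure (triangles) could be measured for a recalled graph. It is an illustration only: the abstract does not specify the exact metrics or statistical model used, and the function names `normalize`, `edge_recall_scores`, and `count_triangles` are hypothetical. Alternating 2-paths are not covered here.

```python
from itertools import combinations


def normalize(edges):
    """Treat edges as undirected: store each as a frozenset of its two endpoints.

    Self-loops are dropped. (Assumption: the graphs are simple and undirected.)
    """
    return {frozenset(e) for e in edges if e[0] != e[1]}


def edge_recall_scores(true_edges, recalled_edges):
    """Return (precision, recall) of the recalled edge set against the true one."""
    true_set, rec_set = normalize(true_edges), normalize(recalled_edges)
    hits = len(true_set & rec_set)
    precision = hits / len(rec_set) if rec_set else 0.0
    recall = hits / len(true_set) if true_set else 0.0
    return precision, recall


def count_triangles(edges):
    """Count node triples whose three pairwise edges are all present."""
    edge_set = normalize(edges)
    nodes = sorted({v for e in edge_set for v in e})
    return sum(
        1
        for a, b, c in combinations(nodes, 3)
        if {frozenset((a, b)), frozenset((b, c)), frozenset((a, c))} <= edge_set
    )


# Hypothetical example: the true graph is a path A-B-C-D; the recalled graph
# drops the edge C-D and adds a spurious edge A-C, which closes a triangle.
true_graph = [("A", "B"), ("B", "C"), ("C", "D")]
recalled_graph = [("A", "B"), ("B", "C"), ("A", "C")]

precision, recall = edge_recall_scores(true_graph, recalled_graph)
extra_triangles = count_triangles(recalled_graph) - count_triangles(true_graph)
```

In this toy case two of the three true edges are recovered, and the recalled graph contains one triangle that the true graph lacks, which is the kind of bias toward triangles that the abstract reports. The brute-force triangle count is only suitable for small graphs.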