Analogical reasoning, the capacity to identify and map structural relationships between different domains, is fundamental to human cognition and learning. Recent studies have shown that large language models (LLMs) can sometimes match humans on analogical reasoning tasks, raising the possibility that analogical reasoning might emerge from domain-general processes. However, it remains debated whether these emergent capacities are largely superficial, limited to simple relations seen during training, or whether they instead encompass the flexible representational and mapping capabilities that are the focus of leading cognitive models of analogy. In this study, we introduce novel analogical reasoning tasks that require participants to map between semantically contentful words and sequences of letters and other abstract characters. These tasks necessitate the ability to flexibly re-represent rich semantic information, an ability known to be central to human analogy but thus far not well captured by existing cognitive theories and models. We assess the performance of both human participants and LLMs on tasks that probe reasoning from semantic structure and semantic content, introducing variations that test the robustness of their analogical inferences. Advanced LLMs match human performance across several conditions, though humans and LLMs respond differently to certain task variations and semantic distractors. Our results thus provide new evidence that LLMs may offer a how-possibly explanation of human analogical reasoning in contexts that are not yet well modeled by existing theories, but that even today's best models are unlikely to yield how-actually explanations.