Large Language Models are Fixated by Red Herrings: Exploring Creative Problem Solving and Einstellung Effect using the Only Connect Wall Dataset

Comments (from arXiv): v3, v4: Minor cosmetic adjustments, typo fixes, etc. relative to v2; fixed the Fig. 2 caption overlapping with text in Section 2.2. v2: added OCW-Randomized and OCW-WordNet results in the new Section 4.3. 22 pages with Appendix.

The quest for human-imitative AI has been an enduring topic in AI research since its inception. The technical evolution and emerging capabilities of the latest cohort of large language models (LLMs) have carried the subject beyond academia into the cultural zeitgeist. While recent NLP evaluation benchmarks test some aspects of human-imitative behaviour (e.g., BIG-bench's 'human-like behavior' tasks), few, if any, examine creative problem-solving abilities. Creative problem solving in humans is a well-studied topic in cognitive neuroscience, with standardized tests that predominantly use the ability to associate (heterogeneous) connections among clue words as a metric for creativity. Exposure to misleading stimuli (distractors dubbed red herrings) impedes human performance in such tasks via the fixation effect and Einstellung paradigm. In cognitive neuroscience studies, such fixations are experimentally induced by pre-exposing participants to incorrect words that are orthographically similar to the subsequent word fragments or clues. The Connecting Wall segment of the popular British quiz show Only Connect essentially mimics Mednick's Remote Associates Test (RAT) formulation with built-in, deliberate red herrings, which makes it an ideal proxy dataset for studying the fixation effect and Einstellung paradigm from cognitive neuroscience in LLMs. In this paper we present the novel Only Connect Wall (OCW) dataset and report results from our evaluation of selected pre-trained language models and LLMs on creative problem-solving tasks such as grouping clue words by heterogeneous connections and identifying the correct open-knowledge-domain connection for each group. We synthetically generate two additional datasets, OCW-Randomized and OCW-WordNet, to further analyze our red-herrings hypothesis in language models. The code and link to the dataset are available at https://github.com/TaatiTeam/OCW.
