Large Language Models (LLMs) demonstrate human-level or even superior language abilities and effectively model syntactic structures, yet the specific computational modules responsible remain unclear. A key question is whether LLM behavioral capabilities stem from mechanisms akin to those in the human brain. To address these questions, we introduce the Hierarchical Frequency Tagging Probe (HFTP), a tool that uses frequency-domain analysis to identify the neuron-wise components of LLMs (e.g., individual Multilayer Perceptron (MLP) neurons) and the cortical regions of the human brain (via intracranial recordings) that encode syntactic structures. Our results show that models such as GPT-2, Gemma, Gemma 2, Llama 2, Llama 3.1, and GLM-4 process syntax in analogous layers, whereas the human brain relies on distinct cortical regions for different syntactic levels. Representational similarity analysis reveals stronger alignment between LLM representations and the left hemisphere of the brain (dominant in language processing) than the right. Notably, model upgrades exhibit divergent trends: Gemma 2 shows greater brain similarity than Gemma, while Llama 3.1 aligns less with the brain than Llama 2 does. These findings offer new insight into the interpretability of LLM behavioral improvements, raising the question of whether these advances are driven by human-like or non-human-like mechanisms, and establish HFTP as a valuable tool bridging computational linguistics and cognitive neuroscience. This project is available at https://github.com/LilTiger/HFTP.
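To make the frequency-tagging idea concrete, the following is a minimal sketch of how a spectral peak at the sentence or phrase rate could be measured for one neuron's per-word activation series. The function name, the input format (one activation value per word for a stream of isochronous fixed-length sentences), and the SNR-style peak measure are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def frequency_tagging_snr(activations, words_per_sentence=4, words_per_phrase=2):
    """Spectral peak ratios of a per-word activation series at the
    sentence and phrase rates (hypothetical helper, not from the paper).

    activations: 1-D array, one value per word, for a stream of
    fixed-length sentences presented at a constant word rate.
    """
    x = np.asarray(activations, dtype=float)
    x = x - x.mean()                         # remove the DC component
    spectrum = np.abs(np.fft.rfft(x))        # magnitude spectrum
    freqs = np.fft.rfftfreq(len(x), d=1.0)   # frequencies in cycles per word

    def peak_snr(target_freq, n_neighbors=2):
        # Ratio of the target bin to the mean of its neighboring bins,
        # a common peak measure in frequency-tagging analyses.
        idx = int(np.argmin(np.abs(freqs - target_freq)))
        lo = max(idx - n_neighbors, 1)
        neighbors = np.concatenate(
            [spectrum[lo:idx], spectrum[idx + 1: idx + 1 + n_neighbors]]
        )
        return spectrum[idx] / neighbors.mean()

    sentence_rate = 1.0 / words_per_sentence   # e.g., 0.25 cycles/word
    phrase_rate = 1.0 / words_per_phrase       # e.g., 0.50 cycles/word
    return peak_snr(sentence_rate), peak_snr(phrase_rate)

# Toy check: 50 four-word sentences -> 200 per-word activations.
# A neuron tracking sentence structure should peak at 0.25 cycles/word.
rng = np.random.default_rng(0)
toy = np.tile([1.0, 0.0, 0.0, 0.0], 50) + 0.1 * rng.standard_normal(200)
print(frequency_tagging_snr(toy))  # sentence-rate SNR should be >> 1
```

A neuron (or cortical channel) whose activity merely tracks individual words shows energy only at the word rate; a clear peak at 0.25 or 0.5 cycles per word indicates tracking of the higher-level sentence or phrase structure.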
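The cross-system comparison can be sketched in the same spirit. Below is a minimal representational similarity analysis, assuming stimulus-by-feature matrices for an LLM layer and a cortical region, correlation distance for the dissimilarity matrices, and Spearman correlation between their condensed upper triangles; these shapes and metric choices are assumptions, not the paper's reported pipeline.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rsa_similarity(llm_reps, brain_reps):
    """Spearman correlation between two representational dissimilarity
    matrices (hypothetical helper, not from the paper).

    llm_reps:   (n_stimuli, n_features) activations from one LLM layer.
    brain_reps: (n_stimuli, n_channels) responses from one cortical region.
    """
    # pdist returns the condensed upper triangle of the pairwise
    # distance matrix; "correlation" distance is 1 - Pearson r.
    rdm_llm = pdist(llm_reps, metric="correlation")
    rdm_brain = pdist(brain_reps, metric="correlation")
    rho, _ = spearmanr(rdm_llm, rdm_brain)
    return rho
```

Comparing RDMs rather than raw activations sidesteps the mismatch in dimensionality and units between MLP neurons and intracranial recordings: only the geometry of the stimulus space, i.e., which sentences are represented as similar, enters the comparison.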