fMRI predictors based on language models of increasing complexity recover brain left lateralization

Over the past decade, studies of naturalistic language processing, in which participants are scanned while listening to continuous text, have flourished. Using word embeddings at first, then large language models, researchers have created encoding models to analyze the brain signals. Presenting these models with the same text as the participants makes it possible to identify brain areas where there is a significant correlation between the functional magnetic resonance imaging (fMRI) time series and the time series predicted from the models' artificial neurons. One intriguing finding from these studies is that they reveal highly symmetric bilateral activation patterns, somewhat at odds with the well-known left lateralization of language processing. Here, we report analyses of an fMRI dataset in which we manipulate the complexity of large language models, testing 28 pretrained models from 8 different families, ranging from 124M to 14.2B parameters. First, we observe that the performance of models in predicting brain responses follows a scaling law: the fit with brain activity increases linearly with the logarithm of the number of parameters of the model (and with its performance on natural language processing tasks). Second, we show that a left-right asymmetry gradually appears as model size increases, and that the difference in left-right brain correlations also follows a scaling law. Whereas the smallest models show no asymmetry, larger models fit left-hemisphere activations increasingly better than right-hemisphere ones. This finding reconciles computational analyses of brain activity using large language models with the classic observation from aphasic patients showing left hemisphere dominance for language.
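The scaling relation described in the abstract can be illustrated with a minimal sketch, assuming the brain score of each model is summarized by a single correlation value per hemisphere. The function name `fit_log_linear` and all numbers below are hypothetical and synthetic; they are not data or code from the study, and only show how a slope against the logarithm of the parameter count, and a left-right difference, could be estimated.

```python
import math

def fit_log_linear(param_counts, scores):
    """Least-squares fit of score = intercept + slope * log10(param_count)."""
    xs = [math.log10(n) for n in param_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic values for illustration only (NOT results from the study).
# Only the smallest and largest sizes match the range quoted in the abstract.
param_counts = [124e6, 1e9, 14.2e9]
left_scores = [0.10 + 0.03 * math.log10(n / 124e6) for n in param_counts]
right_scores = [0.10 + 0.01 * math.log10(n / 124e6) for n in param_counts]

# Slope of the brain score against log10(parameters), per hemisphere.
slope_left, _ = fit_log_linear(param_counts, left_scores)
slope_right, _ = fit_log_linear(param_counts, right_scores)

# Left-minus-right difference and its own slope against log10(parameters).
asymmetry = [l - r for l, r in zip(left_scores, right_scores)]
slope_asym, _ = fit_log_linear(param_counts, asymmetry)
```

In this toy setup the smallest model has zero asymmetry and the left-right difference grows linearly with the logarithm of model size, which mirrors the qualitative pattern the abstract reports.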

