MulCogBench: A Multi-modal Cognitive Benchmark Dataset for Evaluating Chinese and English Computational Language Models

Pre-trained computational language models have recently made remarkable progress in acquiring language abilities once considered unique to humans. Their success has raised interest in whether these models represent and process language as humans do. To answer this question, this paper proposes MulCogBench, a multi-modal cognitive benchmark dataset collected from native Chinese and English participants. It encompasses a variety of cognitive data, including subjective semantic ratings, eye-tracking, functional magnetic resonance imaging (fMRI), and magnetoencephalography (MEG). To assess the relationship between language models and cognitive data, we conducted a similarity-encoding analysis that decodes cognitive data based on its pattern similarity with textual embeddings. Results show that language models share significant similarities with human cognitive data and that the similarity patterns are modulated by data modality and stimulus complexity. Specifically, context-aware models outperform context-independent models as the complexity of the language stimuli increases. The shallow layers of context-aware models are better aligned with the high-temporal-resolution MEG signals, whereas the deeper layers show more similarity with the high-spatial-resolution fMRI. These results indicate that the relationship between language models and brain language representations is nuanced. Moreover, the results for Chinese and English are highly consistent, suggesting that these findings generalize across languages.
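The abstract does not spell out the similarity-encoding procedure, so the following is only a minimal sketch of one common form of the idea, not the authors' implementation. It assumes that each held-out stimulus is predicted as a similarity-weighted combination of the cognitive responses to the training stimuli, with similarity measured as the Pearson correlation between text embeddings, and that the prediction is scored by correlating it with the measured response. The function names, the weight normalization, and the synthetic data are all hypothetical.

```python
import numpy as np


def _zscore_rows(x):
    """Standardize each row so that a scaled dot product equals Pearson r."""
    x = x - x.mean(axis=1, keepdims=True)
    return x / x.std(axis=1, keepdims=True)


def similarity_encoding_predict(train_emb, train_cog, test_emb):
    """Predict cognitive responses for held-out stimuli (illustrative only).

    train_emb: (n_train, d) text embeddings of the training stimuli
    train_cog: (n_train, v) cognitive responses (e.g. voxels, sensors, ratings)
    test_emb:  (n_test, d) text embeddings of the held-out stimuli
    Returns an (n_test, v) array of predicted responses.
    """
    d = train_emb.shape[1]
    # Pearson correlation between every test and every training embedding.
    sim = _zscore_rows(test_emb) @ _zscore_rows(train_emb).T / d
    # Assumption: normalize the weights per test item by their absolute sum.
    weights = sim / np.abs(sim).sum(axis=1, keepdims=True)
    return weights @ train_cog


def rowwise_pearson(a, b):
    """Pearson correlation between matching rows of two (n, v) arrays."""
    return (_zscore_rows(a) * _zscore_rows(b)).mean(axis=1)


if __name__ == "__main__":
    # Synthetic stand-in data: responses are a noisy linear map of embeddings.
    rng = np.random.default_rng(0)
    emb = rng.standard_normal((220, 50))
    cog = emb @ rng.standard_normal((50, 20)) + 0.1 * rng.standard_normal((220, 20))
    pred = similarity_encoding_predict(emb[:200], cog[:200], emb[200:])
    print("mean correlation:", rowwise_pearson(pred, cog[200:]).mean())
```

In a layer-wise comparison such as the one the abstract describes, the same routine would be run once per model layer, with `train_emb` and `test_emb` taken from that layer, and the resulting scores compared across layers and data modalities.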

