While brain-aligned large language models (LLMs) have garnered attention for their potential as cognitive models and for their potential to enhance safety and trustworthiness in AI, the role of brain alignment in linguistic competence remains uncertain. In this work, we investigate the functional implications of brain alignment by introducing brain-misaligned models: LLMs intentionally trained to predict brain activity poorly while maintaining high language modeling performance. We evaluate these models on over 200 downstream tasks spanning diverse linguistic domains, including semantics, syntax, discourse, reasoning, and morphology. By comparing brain-misaligned models with well-matched brain-aligned counterparts, we isolate the specific impact of brain alignment on language understanding. Our experiments reveal that brain misalignment substantially impairs downstream performance, highlighting the critical role of brain alignment in achieving robust linguistic competence. These findings underscore the importance of brain alignment in LLMs and offer novel insights into the relationship between neural representations and linguistic processing.