Building systems that achieve a deeper understanding of language is one of the central goals of natural language processing (NLP). Towards this goal, recent work has begun to train language models on narrative datasets that require extracting the most critical information by integrating across long contexts. However, it remains an open question whether these models are learning a deeper understanding of the text or simply learning a heuristic to complete the task. This work investigates the question by turning to the one language processing system that truly understands complex language: the human brain. We show that training language models for deeper narrative understanding results in richer representations that have improved alignment with human brain activity. We further find that the improvements in brain alignment are larger for character names than for other discourse features, which indicates that these models are learning important narrative elements. Taken together, these results suggest that this type of training can indeed lead to deeper language understanding. These findings have consequences both for cognitive neuroscience, by revealing some of the significant factors behind brain-NLP alignment, and for NLP, by highlighting that understanding of long-range context can be improved beyond language modeling.