A recent study (Kuribayashi et al., 2025) showed that human sentence processing behavior, typically measured on syntactically unchallenging constructions, can be effectively modeled using surprisal from early layers of large language models (LLMs). This raises the question of whether the advantage of internal layers extends to more syntactically challenging constructions, where surprisal has been reported to underestimate human cognitive effort. In this paper, we begin by exploring which internal layers better estimate the human cognitive effort observed in syntactic ambiguity processing in English. Our experiments show that, in contrast to naturalistic reading, later layers better estimate this cognitive effort, although they still underestimate the human data. This dual alignment sheds light on different modes of sentence processing in humans and LMs: naturalistic reading relies on somewhat weak prediction akin to that of earlier LM layers, while syntactically challenging processing requires more fully contextualized representations, which are better modeled by later LM layers. Motivated by these findings, we also explore several probability-update measures using shallow and deep LM layers, and show that they offer an advantage complementary to single-layer surprisal in reading time modeling.
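To make the quantities named above concrete, the following is a minimal sketch of layer-wise surprisal and of one possible probability-update measure. It is illustrative only: the abstract does not specify the measures used in the paper, so the function names `layer_surprisal` and `probability_update`, the log-ratio form of the update, and the toy logits are all assumptions. It also assumes that each layer's next-token logits have already been obtained somehow, for example by projecting intermediate hidden states through the model's output embedding.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def layer_surprisal(logits, target_index):
    """Surprisal in bits, -log2 p(target), under one layer's distribution."""
    return -math.log2(softmax(logits)[target_index])


def probability_update(shallow_logits, deep_logits, target_index):
    """Hypothetical update measure (an assumption, not the paper's definition):
    shallow-layer surprisal minus deep-layer surprisal, which equals
    log2(p_deep / p_shallow). Positive values mean the deeper layer
    assigns the observed token a higher probability."""
    return (layer_surprisal(shallow_logits, target_index)
            - layer_surprisal(deep_logits, target_index))


# Toy 4-token vocabulary; index 0 is the token that was actually observed.
shallow = [0.0, 0.0, 0.0, 0.0]          # uniform: p(target) = 1/4
deep = [math.log(5.0), 0.0, 0.0, 0.0]   # p(target) = 5/8

print(layer_surprisal(shallow, 0))
print(layer_surprisal(deep, 0))
print(probability_update(shallow, deep, 0))
```

In a reading-time regression, a measure of this kind could be entered as a predictor alongside single-layer surprisal to test whether it explains additional variance, which is the sense of "complementary" in the abstract.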