We present a new perspective on how readers integrate context during real-time language comprehension. Our proposals build on surprisal theory, which posits that the processing effort of a linguistic unit (e.g., a word) is an affine function of its in-context information content. We first observe that surprisal is only one of many ways in which a contextual predictor can be derived from a language model. Another is the pointwise mutual information (PMI) between a unit and its context, which turns out to yield the same predictive power as surprisal when controlling for unigram frequency. Moreover, both PMI and surprisal are correlated with frequency, which means that neither contains information about context alone. In response, we propose a technique that projects surprisal onto the orthogonal complement of frequency, yielding a new contextual predictor that is uncorrelated with frequency. Our experiments show that the proportion of variance in reading times explained by context is substantially smaller when context is represented by the orthogonalized predictor. From an interpretability standpoint, this indicates that previous studies may have overstated the role of context in predicting reading times.
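To make the projection step concrete, the following is a minimal sketch, not the authors' implementation. It assumes that "frequency" is represented by a single per-word predictor (here a synthetic log-frequency vector) and that an intercept is included in the projection so that the result has zero Pearson correlation with frequency; the abstract specifies neither choice. The function name `orthogonalize` and all data are hypothetical. For reference, PMI differs from negative surprisal only by the unigram term, since log p(u | c) / p(u) = log p(u | c) - log p(u), which is consistent with the two predictors being interchangeable once unigram frequency is controlled for.

```python
import numpy as np


def orthogonalize(surprisal, frequency):
    """Project surprisal onto the orthogonal complement of span{1, frequency}.

    Hypothetical helper: this is ordinary least-squares residualization,
    used here as one way to realize the projection described in the text.
    """
    s = np.asarray(surprisal, dtype=float)
    f = np.asarray(frequency, dtype=float)
    # Design matrix with an intercept column (an assumption) and frequency.
    X = np.column_stack([np.ones_like(f), f])
    # Least-squares fit of surprisal on [1, frequency].
    coef, *_ = np.linalg.lstsq(X, s, rcond=None)
    # The residual is the component of surprisal orthogonal to frequency.
    return s - X @ coef


# Synthetic data for illustration only; not taken from any corpus.
rng = np.random.default_rng(0)
log_freq = rng.normal(size=1000)
surprisal = -0.8 * log_freq + rng.normal(scale=0.5, size=1000)

orth = orthogonalize(surprisal, log_freq)

corr_before = np.corrcoef(surprisal, log_freq)[0, 1]
corr_after = np.corrcoef(orth, log_freq)[0, 1]
print(f"correlation with frequency before: {corr_before:.3f}")
print(f"correlation with frequency after:  {corr_after:.3e}")
```

In a reading-time regression, `orth` would then stand in for raw surprisal alongside the frequency predictor, so that any variance attributed to it cannot be shared with frequency.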