Over the past two decades, numerous studies have demonstrated that less predictable (i.e., higher surprisal) words take more time to read. In general, these studies have implicitly assumed that the reading process is purely responsive: readers observe a new word and allocate time to process it as required. We argue that prior results are also compatible with a reading process that is at least partially anticipatory: readers could make predictions about a future word and allocate time to process it based on their expectation. In this work, we operationalize this anticipation as a word's contextual entropy. We assess the effect of anticipation on reading by comparing how well surprisal and contextual entropy predict reading times on four naturalistic reading datasets: two self-paced reading and two eye-tracking. Across datasets and analyses, we find substantial evidence for an effect of contextual entropy, over and above surprisal, on a word's reading time (RT); in fact, entropy is sometimes better than surprisal at predicting a word's RT. Spillover effects, however, are generally captured only by surprisal, not by entropy. Further, we hypothesize four cognitive mechanisms through which contextual entropy could impact RTs, and we are able to design experiments to analyze three of them. Overall, our results support a view of reading that is not just responsive, but also anticipatory.
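To make the two predictors concrete, the following minimal sketch computes them under their standard information-theoretic definitions: surprisal as the negative log-probability of the word actually observed, and contextual entropy as the expected surprisal over all candidate next words. The abstract does not state these formulas, so the definitions are assumed here. The function names and the toy next-word distribution are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import math


def surprisal(next_word_probs, word):
    """Surprisal in bits of the observed word: -log2 p(word | context)."""
    return -math.log2(next_word_probs[word])


def contextual_entropy(next_word_probs):
    """Shannon entropy in bits of the next-word distribution.

    This is the expected surprisal, computed before the word is seen,
    which is why it can serve as a measure of anticipation.
    """
    return -sum(p * math.log2(p) for p in next_word_probs.values() if p > 0)


# Hypothetical distribution over the next word given some context
# (illustrative values only, not data from the study).
next_word_probs = {"dog": 0.5, "cat": 0.25, "bird": 0.125, "fish": 0.125}

entropy_bits = contextual_entropy(next_word_probs)
likely_bits = surprisal(next_word_probs, "dog")
unlikely_bits = surprisal(next_word_probs, "fish")

print(f"contextual entropy: {entropy_bits:.2f} bits")
print(f"surprisal of 'dog': {likely_bits:.2f} bits")
print(f"surprisal of 'fish': {unlikely_bits:.2f} bits")
```

The contrast the sketch is meant to show is that contextual entropy is the same whichever word eventually appears, since it depends only on the context, whereas surprisal differs between the likely and the unlikely continuation. A responsive account ties reading time to the latter, and an anticipatory account can tie it to the former.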