In recent years, several influential computational models and metrics have been proposed to predict how humans comprehend and process sentences. One particularly promising approach is contextual semantic similarity. Inspired by the attention mechanism in the Transformer and by human memory mechanisms, this study proposes an ``attention-aware'' approach for computing contextual semantic relevance. The approach accounts for the differing contributions of contextual parts and for the expectation effect, allowing it to incorporate contextual information fully. It also facilitates the simulation and evaluation of existing reading models. The resulting ``attention-aware'' metrics of semantic relevance predict fixation durations in Chinese reading tasks recorded in an eye-tracking corpus more accurately than metrics calculated by existing approaches. The findings further provide strong support for the presence of semantic preview benefits in naturalistic Chinese reading. Moreover, because the attention-aware metrics are memory-based, they are highly interpretable from both linguistic and cognitive standpoints, making them a valuable computational tool for modeling eye movements in reading and for gaining insight into the process of language comprehension. Our approach underscores the potential of these metrics to advance our understanding of how humans comprehend and process language.
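The idea of giving contextual parts different contributions can be illustrated with a minimal sketch. It is not the authors' formulation: the function names, the offset-to-weight mapping, the toy vectors, and the choice to treat a positive offset (the upcoming word) as a stand-in for the expectation or preview effect are all assumptions made for illustration. The sketch scores a target word by a weighted average of its cosine similarity to neighboring words.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def attention_aware_relevance(vectors, i, weights):
    """Weighted average similarity between word i and its context.

    vectors: one embedding per word in the sentence (hypothetical toy data here).
    weights: maps a relative offset to a weight, e.g. {-2: 0.25, -1: 0.5, 1: 0.25}.
             Negative offsets are preceding words; positive offsets are upcoming
             words (an assumed proxy for expectation/preview).
    Offsets that fall outside the sentence are skipped and the remaining
    weights are renormalized.
    """
    total = 0.0
    weight_sum = 0.0
    for offset, w in weights.items():
        j = i + offset
        if 0 <= j < len(vectors):
            total += w * cosine(vectors[i], vectors[j])
            weight_sum += w
    return total / weight_sum if weight_sum else 0.0


# Toy usage: four 2-dimensional "embeddings" and hand-picked weights.
toy_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
toy_weights = {-2: 0.25, -1: 0.5, 1: 0.25}
score = attention_aware_relevance(toy_vectors, 2, toy_weights)
print(round(score, 4))
```

In this sketch the weights are fixed by hand. Under the same assumptions, they could be replaced by any scheme that assigns larger weights to nearer or more informative words.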