Recent work has shown that larger language models better predict eye-movement and reading-time data. Although even the best models under-allocate probability mass to human responses, larger models produce higher-quality estimates of upcoming words and of their production probabilities in cloze data: they are less sensitive to lexical co-occurrence statistics and more closely aligned semantically with human cloze responses. These results support the claim that the greater memorization capacity of larger models helps them guess more semantically appropriate words, but makes them less sensitive to the low-level information that is relevant for word recognition.