Frequency effects in Linear Discriminative Learning

Word frequency is a strong predictor in most lexical processing tasks, so any model of word recognition needs to account for how word frequency effects arise. The Discriminative Lexicon Model (DLM; Baayen et al., 2018a, 2019) models lexical processing with linear mappings between words' forms and their meanings. So far, these mappings could be obtained in one of two ways: incrementally via error-driven learning, a computationally expensive process that captures frequency effects, or with an efficient but frequency-agnostic closed-form solution that models the theoretical endstate of learning (EL), in which all words are learned optimally. In this study we show how an efficient yet frequency-informed mapping between form and meaning can be obtained (frequency-informed learning; FIL). We find that FIL approximates an incremental solution well while being computationally much cheaper. FIL shows relatively low type accuracy but high token accuracy, demonstrating that the model correctly processes most of the word tokens that speakers encounter in daily life. We use FIL to model reaction times in the Dutch Lexicon Project (Keuleers et al., 2010) and find that it predicts the S-shaped relationship between frequency and mean reaction time well, but underestimates the variance of reaction times for low-frequency words. Compared to EL, FIL is also better able to account for priming effects in an auditory lexical decision task in Mandarin Chinese (Lee, 2007). Finally, we use ordered data from CHILDES (Brown, 1973; Demuth et al., 2006) to compare mappings obtained with FIL and with incremental learning. The mappings are highly correlated, but with FIL some nuances based on word ordering effects are lost. Our results show how frequency effects in a learning model can be simulated efficiently by means of a closed-form solution, and raise questions about how best to account for low-frequency words in cognitive models.
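To make the contrast between the three ways of estimating a mapping concrete, the following is a minimal sketch and not the authors' implementation. It assumes that EL corresponds to an ordinary least-squares solution via the pseudoinverse, that FIL can be written as a frequency-weighted least-squares problem (each word's row scaled by the square root of its relative frequency), and that incremental learning is a Widrow-Hoff style delta rule applied once per token. The matrices `C` (forms) and `S` (meanings), the frequency vector `freq`, and all function names are hypothetical toy stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (hypothetical): 6 words, 4 form cues, 3 semantic dimensions.
C = rng.normal(size=(6, 4))              # one form vector per word (rows)
S = rng.normal(size=(6, 3))              # one semantic vector per word (rows)
freq = np.array([50, 20, 10, 5, 2, 1])   # token frequency of each word


def endstate_mapping(C, S):
    """Frequency-agnostic closed form: least-squares solution of C @ F = S."""
    return np.linalg.pinv(C) @ S


def frequency_informed_mapping(C, S, freq):
    """Assumed FIL formulation: least squares with rows weighted by frequency.

    Scaling each row by sqrt(relative frequency) makes the squared error of
    a word count in proportion to how often that word occurs.
    """
    w = np.sqrt(freq / freq.sum())[:, None]
    return np.linalg.pinv(w * C) @ (w * S)


def incremental_mapping(C, S, freq, lr=0.01, epochs=200, seed=0):
    """Error-driven (delta rule) learning, one update per word token."""
    order_rng = np.random.default_rng(seed)
    tokens = np.repeat(np.arange(len(freq)), freq)  # word index per token
    F = np.zeros((C.shape[1], S.shape[1]))
    for _ in range(epochs):
        for i in order_rng.permutation(tokens):
            error = S[i] - C[i] @ F                 # prediction error
            F += lr * np.outer(C[i], error)         # move F to reduce it
    return F


def weighted_error(F, C, S, freq):
    """Frequency-weighted sum of squared prediction errors."""
    p = freq / freq.sum()
    return float(np.sum(p[:, None] * (S - C @ F) ** 2))


F_el = endstate_mapping(C, S)
F_fil = frequency_informed_mapping(C, S, freq)
F_inc = incremental_mapping(C, S, freq)

print("weighted error, EL :", weighted_error(F_el, C, S, freq))
print("weighted error, FIL:", weighted_error(F_fil, C, S, freq))
print("weighted error, incremental:", weighted_error(F_inc, C, S, freq))
```

Under these assumptions the closed-form FIL solution needs a single pseudoinverse, whereas the incremental version needs one update per token per pass, which is where the difference in computational cost comes from. Because the closed form only sees frequencies, it cannot reflect the order in which tokens were presented.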
