A wide variety of orthographic coding schemes and models of visual word identification have been developed to account for masked priming data that provide a measure of orthographic similarity between letter strings. These models tend to include hand-coded orthographic representations with single-unit coding for specific forms of knowledge (e.g., units coding for a letter in a given position). Here we assess how well a range of these coding schemes and models account for the pattern of form priming effects taken from the Form Priming Project and compare these findings to results observed with 11 standard deep neural networks (DNNs) developed in computer science. We find that deep convolutional networks (CNNs) perform as well as or better than the coding schemes and word recognition models, whereas transformer networks do less well. The success of CNNs is remarkable because their architectures were not developed to support word recognition (they were designed to perform well on object recognition), they classify pixel images of words (rather than artificial encodings of letter strings), and their training was highly simplified (not respecting many key aspects of human experience). In addition to these form priming effects, we find that the DNNs can account for visual similarity effects on priming that are beyond all current psychological models of priming. The findings add to the recent work of Hannagan et al. (2021) and suggest that CNNs should be given more attention in psychology as models of human visual word recognition.