A wide variety of orthographic coding schemes and models of visual word identification have been developed to account for masked priming data, which provide a measure of orthographic similarity between letter strings. These models tend to include hand-coded orthographic representations with single-unit coding for specific forms of knowledge (e.g., units coding for a letter in a given position or for a letter sequence). Here we assess how well a range of these coding schemes and models account for the pattern of form priming effects taken from the Form Priming Project, and we compare these findings to results obtained with 11 standard deep neural network models (DNNs) developed in computer science. We find that deep convolutional networks perform as well as or better than the coding schemes and word recognition models, whereas transformer networks perform less well. The success of convolutional networks is remarkable because their architectures were not developed to support word recognition (they were designed to perform well on object recognition) and because they classify pixel images of words (rather than artificial encodings of letter strings). The findings add to the recent work of Hannagan et al. (2021), suggesting that convolutional networks may capture key aspects of visual word identification.
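As an illustration of the kind of hand-coded representation described above (units coding for a letter in a given position, or for a letter sequence), the following minimal Python sketch compares two toy schemes. The function names (`slot_code`, `open_bigram_code`, `overlap`) and the overlap measure are assumptions made for this example; they are not the specific coding schemes or similarity metrics evaluated in the paper.

```python
from itertools import combinations


def slot_code(s):
    """Toy position-specific coding: one unit per (position, letter) pair."""
    return {(i, ch) for i, ch in enumerate(s)}


def open_bigram_code(s):
    """Toy letter-sequence coding: one unit per ordered letter pair, any gap."""
    return {a + b for a, b in combinations(s, 2)}


def overlap(prime, target, code):
    """Fraction of the target's units that are also active for the prime.

    This is an illustrative similarity score, not a published metric.
    """
    target_units = code(target)
    if not target_units:
        return 0.0
    return len(code(prime) & target_units) / len(target_units)


if __name__ == "__main__":
    # A transposed-letter prime ("jugde") for the target "judge".
    print(overlap("jugde", "judge", slot_code))
    print(overlap("jugde", "judge", open_bigram_code))
```

Under these toy definitions, a transposed-letter prime shares fewer position-specific units with its target than ordered-pair units, which is the sort of contrast between schemes that masked form priming data can be used to evaluate.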