Are similar, or even identical, mechanisms used in the computational modeling of speech segmentation, serial image processing, and music processing? We address this question by exploring how TRACX2 (French et al., 2011; French \& Cottrell, 2014; Mareschal \& French, 2017), a recognition-based, recursive connectionist autoencoder model of chunking and sequence segmentation that has successfully simulated speech and serial-image processing, might be applied to elementary melody perception. The model, a three-layer autoencoder that recognizes ``chunks'' of short sequences of intervals that it has frequently encountered in its input, is trained on the tone intervals of melodically simple French children's songs. It dynamically incorporates the internal representations of these chunks into new input. Its internal representations cluster in a manner consistent with ``human-recognizable'' melodic categories. TRACX2 is sensitive to both contour and proximity information in the musical chunks that it encounters. It also shows the ``end-of-word'' superiority effect demonstrated by Saffran et al. (1999) for short musical phrases. Overall, the findings suggest that the recursive autoassociative chunking mechanism implemented in TRACX2 may be a general segmentation and chunking mechanism underlying not only word- and image-chunking but also elementary melody processing.
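To make the recursive chunking idea concrete, the following is a minimal sketch of a three-layer autoencoder that feeds its own hidden representation back into the input when reconstruction error is low. It is an illustration under stated assumptions, not the authors' implementation: the bipolar one-hot interval coding, the tanh units, the learning rate, the absence of bias units, and the error-weighted blending rule are all choices made here for the example, and the names `interval_onehot` and `ChunkingAutoencoder` are hypothetical.

```python
import numpy as np


def interval_onehot(interval, max_abs=12):
    """Encode an interval in semitones as a bipolar (+1/-1) one-hot vector.

    Assumed coding for illustration; the abstract does not specify one.
    """
    vec = -np.ones(2 * max_abs + 1)
    vec[interval + max_abs] = 1.0
    return vec


class ChunkingAutoencoder:
    """Hypothetical 2n -> n -> 2n autoencoder with recursive input blending."""

    def __init__(self, n, lr=0.01, k=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.w_in = rng.uniform(-0.5, 0.5, (n, 2 * n))   # input -> hidden
        self.w_out = rng.uniform(-0.5, 0.5, (2 * n, n))  # hidden -> output
        self.lr = lr  # learning rate (assumed value)
        self.k = k    # steepness of the error-to-blend mapping (assumed)

    def step(self, lhs, rhs):
        """Autoassociate [lhs, rhs], update weights, return (hidden, error)."""
        x = np.concatenate([lhs, rhs])
        hidden = np.tanh(self.w_in @ x)
        out = np.tanh(self.w_out @ hidden)
        err = out - x
        delta = float(np.max(np.abs(err)))  # low error = pair is "recognized"
        # One step of plain backpropagation on the squared reconstruction error.
        g_out = err * (1.0 - out ** 2)
        g_hid = (self.w_out.T @ g_out) * (1.0 - hidden ** 2)
        self.w_out -= self.lr * np.outer(g_out, hidden)
        self.w_in -= self.lr * np.outer(g_hid, x)
        return hidden, delta

    def train_sequence(self, vectors):
        """Slide over a sequence, blending the hidden state into the next input."""
        lhs = vectors[0]
        deltas = []
        for rhs in vectors[1:]:
            hidden, delta = self.step(lhs, rhs)
            mix = np.tanh(self.k * delta)
            # Low error: carry the chunk's internal representation forward.
            # High error: fall back to the raw item just seen.
            lhs = (1.0 - mix) * hidden + mix * rhs
            deltas.append(delta)
        return deltas


# Usage: a toy interval sequence (in semitones), not data from the study.
model = ChunkingAutoencoder(n=25)
sequence = [interval_onehot(i) for i in [2, 2, -4, 2, 2, -4]]
errors = model.train_sequence(sequence)
```

In this sketch, a pair of inputs that has been seen often is reconstructed with low error, so its hidden representation dominates the left half of the next input; this is one way to realize the dynamic incorporation of chunk representations into new input that the abstract describes.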