As a first step towards a complete computational model of speech learning involving perception-production loops, we investigate the forward mapping between pseudo-motor commands and articulatory trajectories. A phonetic target sequence is encoded with one of two phonological feature sets, based respectively on generative and articulatory phonology. We compare several interpolation techniques for generating smooth trajectories in these feature spaces, optionally optimising target values and timing to capture co-articulation effects. We report the Pearson correlation between a linear projection of the generated trajectories and articulatory data derived from a multi-speaker dataset of electromagnetic articulography (EMA) recordings. An extended feature set based on generative phonology, combined with linear interpolation, yields a correlation of 0.67. We discuss the implications of these results for our understanding of the dynamics of biological motion.
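The evaluation described in the abstract (interpolate feature targets, project linearly onto articulatory channels, score with Pearson correlation) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the data are synthetic stand-ins rather than EMA recordings, and the projection is fitted by ordinary least squares, which the abstract does not specify.

```python
import numpy as np

def interpolate_targets(target_times, targets, frame_times):
    """Linearly interpolate each feature dimension between target points."""
    return np.column_stack([
        np.interp(frame_times, target_times, targets[:, d])
        for d in range(targets.shape[1])
    ])

def _with_bias(X):
    """Append a constant column so the projection can include an offset."""
    return np.hstack([X, np.ones((X.shape[0], 1))])

def fit_linear_projection(X, Y):
    """Least-squares affine map from feature trajectories X to channels Y (assumed fit)."""
    W, *_ = np.linalg.lstsq(_with_bias(X), Y, rcond=None)
    return W

def project(X, W):
    """Apply the fitted affine map to feature trajectories X."""
    return _with_bias(X) @ W

def pearson_per_channel(A, B):
    """Pearson correlation between matching columns of A and B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    return (A * B).sum(axis=0) / np.sqrt((A ** 2).sum(axis=0) * (B ** 2).sum(axis=0))

# Synthetic example: 4 phonetic targets described by 3 binary features (hypothetical values).
target_times = np.array([0.0, 0.2, 0.4, 0.6])           # seconds
targets = np.array([[0, 1, 0],
                    [1, 1, 0],
                    [1, 0, 1],
                    [0, 0, 0]], dtype=float)
frame_times = np.linspace(0.0, 0.6, 61)                  # 100 Hz frames

trajectory = interpolate_targets(target_times, targets, frame_times)

# Stand-in for articulatory data: a random linear mix of the features plus noise.
rng = np.random.default_rng(0)
mixing = rng.standard_normal((3, 4))                     # 4 synthetic channels
ema_like = trajectory @ mixing + 0.1 * rng.standard_normal((61, 4))

W = fit_linear_projection(trajectory, ema_like)
r = pearson_per_channel(project(trajectory, W), ema_like)
print("per-channel correlation:", r)
print("mean correlation:", r.mean())
```

Because the synthetic data are generated from the same trajectories that are being scored, the correlations printed here say nothing about real articulatory data. The sketch only shows how the three stages fit together.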