Traditional approaches to understanding phonological learning have relied predominantly on curated text data. Although insightful, such approaches are limited to the knowledge captured in textual representations of spoken language. To overcome this limitation, we investigate the potential of the Featural InfoWaveGAN model to learn iterative long-distance vowel harmony from raw speech data. We focus on Assamese, a language known for its phonologically regressive and word-bound vowel harmony. We demonstrate that the model is adept at grasping the intricacies of Assamese phonotactics, particularly iterative long-distance harmony with regressive directionality. It also produces illicit non-iterative forms resembling the speech errors observed during human language acquisition. Our statistical analysis reveals a preference for a specific [+high,+ATR] vowel as a trigger across novel items, indicative of feature learning. More data and greater control could improve the model's proficiency, which contrasts with the universality of learning.