This paper explores how Generative Adversarial Networks (GANs) learn representations of phonological phenomena. We analyze how GANs encode contrastive and non-contrastive nasality in French and English vowels by applying the ciwGAN architecture (Begus 2021a). Begus claims that ciwGAN encodes linguistically meaningful representations with categorical variables in its latent space, and that manipulating these latent variables gives nearly one-to-one control over the phonological features of ciwGAN's generated outputs. Our results, however, show that the latent variables interact in their effect on the features of the generated outputs, which suggests that the representations learned by neural networks differ from the phonological representations proposed by linguists. At the same time, ciwGAN is able to distinguish contrastive from non-contrastive features in English and French by encoding them differently. Comparing GANs trained on different languages clarifies which language-specific features contribute to the development of language-specific phonological representations. We also discuss the role of training data frequencies in phonological feature learning.