Closely related languages share linguistic similarities that allow speakers of one language to understand speakers of another without having actively learned it. This mutual intelligibility varies in degree and is typically tested in psycholinguistic experiments. To study mutual intelligibility computationally, we propose a computer-assisted method based on the Linear Discriminative Learner, a computational model developed to approximate the cognitive processes by which humans learn languages, which we extend with multilingual semantic vectors and multilingual sound classes. We test the model on cognate data from German, Dutch, and English, three closely related Germanic languages. We find that the model's comprehension accuracy depends on (1) the automatic trimming of inflections and (2) the language pair for which comprehension is tested. Our multilingual modelling approach not only offers new methodological findings for the automatic testing of mutual intelligibility across languages but also extends the use of Linear Discriminative Learning to multilingual settings.
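To make the idea of testing comprehension across languages more concrete, the following is a minimal sketch of a linear form-to-meaning mapping of the kind used in Linear Discriminative Learning. It is not the implementation evaluated here. Several things are assumptions made purely for illustration: the four-word toy lexicon, the use of raw letter bigrams in place of multilingual sound classes, random vectors in place of multilingual semantic vectors, and the helper names `bigrams`, `cue_matrix`, and `accuracy`. The mapping is estimated on the listener's language (German) and then applied to cognate forms from another language (Dutch); a word counts as understood when its predicted semantic vector is closest to the vector of the intended concept.

```python
import numpy as np

def bigrams(word):
    """Split a word into letter bigrams, with '#' marking word boundaries."""
    padded = f"#{word}#"
    return [padded[i:i + 2] for i in range(len(padded) - 1)]

def cue_matrix(words, cue_index):
    """Binary word-by-cue matrix; cues unknown to the listener are ignored."""
    C = np.zeros((len(words), len(cue_index)))
    for row, word in enumerate(words):
        for cue in bigrams(word):
            col = cue_index.get(cue)
            if col is not None:
                C[row, col] = 1.0
    return C

def accuracy(S_hat, S):
    """Share of words whose predicted vector is most similar (cosine)
    to the semantic vector of the intended concept."""
    def unit(M):
        norms = np.linalg.norm(M, axis=1, keepdims=True)
        return M / np.maximum(norms, 1e-12)
    sims = unit(S_hat) @ unit(S).T
    return float(np.mean(np.argmax(sims, axis=1) == np.arange(len(S))))

# Toy cognate lexicon (illustrative only): row i in both lists is the same concept.
german = ["wasser", "haus", "buch", "milch"]
dutch = ["water", "huis", "boek", "melk"]

# Stand-in for shared multilingual semantic vectors: one random vector per concept.
rng = np.random.default_rng(0)
S = rng.normal(size=(len(german), 8))

# The listener only knows the form cues that occur in their own language.
cue_index = {cue: i for i, cue in
             enumerate(sorted({c for w in german for c in bigrams(w)}))}

# Comprehension mapping F solves C F = S in the least-squares sense.
C_train = cue_matrix(german, cue_index)
F = np.linalg.pinv(C_train) @ S

train_acc = accuracy(C_train @ F, S)                    # German listener, German words
test_acc = accuracy(cue_matrix(dutch, cue_index) @ F, S)  # German listener, Dutch words

print(f"within-language accuracy: {train_acc:.2f}")
print(f"cross-language accuracy:  {test_acc:.2f}")
```

In this sketch, cross-language accuracy depends entirely on how many form cues the cognates share, which is why a representation that abstracts over spelling or pronunciation differences (such as sound classes) and the removal of language-specific inflections can plausibly change the outcome.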