This paper investigates the challenges of building Swiss German speech translation systems, focusing on the impact of dialect diversity and of the differences between Swiss German and Standard German. Swiss German is a spoken language with no formal writing system; it comprises many diverse dialects and is a low-resource language with only around 5 million speakers. The study is guided by two research questions: (1) how does including or excluding dialects during the training of Swiss German speech translation models affect performance on specific dialects, and (2) how do the differences between Swiss German and Standard German affect system performance? We show that dialect diversity and linguistic differences pose significant challenges to Swiss German speech translation, in line with linguistic hypotheses derived from empirical investigations.