Towards spoken dialect identification of Irish

Source: arXiv. Accepted to the 2nd Annual Meeting of the Special Interest Group on Under-resourced Languages (SIGUL), an Interspeech 2023 workshop, Dublin.

The Irish language is rich in its diversity of dialects and accents. This compounds the difficulty of creating a speech recognition system for this low-resource language, as such a system must contend with a high degree of variability using limited corpora. A recent study investigating dialect bias in Irish ASR found that balanced training corpora gave rise to unequal dialect performance, with performance for the Ulster dialect being consistently worse than for the Connacht or Munster dialects. Motivated by this, the present experiments investigate spoken dialect identification of Irish, with a view to incorporating such a system into the speech recognition pipeline. Two acoustic classification models, XLS-R and ECAPA-TDNN, are tested in conjunction with a text-based classifier built on a pretrained Irish-language BERT model. The ECAPA-TDNN, specifically a model pretrained for language identification on the VoxLingua107 dataset, performed best overall, with an accuracy of 73%. Fusing this model's outputs with those of the text-based model improved accuracy further, to 76%. The Ulster dialect was identified most accurately, with an accuracy of 94%; however, the model struggled to disambiguate between the Connacht and Munster dialects, suggesting that a more nuanced approach may be necessary to robustly distinguish between the dialects of Irish.
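The abstract does not say how the acoustic and text-based outputs were fused. As a rough illustration only, the sketch below shows one common option, score-level (late) fusion by weighted averaging of per-dialect probabilities. The function names, the fusion weight, and the example scores are all hypothetical and are not taken from the paper.

```python
# Illustrative late fusion of two dialect classifiers.
# Assumption: each model emits a probability per dialect, in this order.
DIALECTS = ("Ulster", "Connacht", "Munster")


def fuse_posteriors(acoustic, text, weight=0.5):
    """Return a weighted average of two probability vectors.

    `weight` is the share given to the acoustic model; the text model
    receives `1 - weight`. The result is renormalised to sum to 1.
    """
    if len(acoustic) != len(text):
        raise ValueError("both models must score the same set of dialects")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    fused = [weight * a + (1.0 - weight) * t for a, t in zip(acoustic, text)]
    total = sum(fused)
    return [p / total for p in fused]


def predict_dialect(acoustic, text, weight=0.5):
    """Pick the dialect with the highest fused probability."""
    fused = fuse_posteriors(acoustic, text, weight)
    best = max(range(len(fused)), key=fused.__getitem__)
    return DIALECTS[best], fused


# Hypothetical scores: the acoustic model is torn between Connacht and
# Munster, while the text model leans towards Munster.
acoustic_scores = [0.10, 0.46, 0.44]
text_scores = [0.05, 0.30, 0.65]

label, fused = predict_dialect(acoustic_scores, text_scores, weight=0.6)
print(label, [round(p, 3) for p in fused])
```

In this made-up case the acoustic model alone would narrowly choose Connacht, and the text scores tip the fused decision to Munster. In practice the weight would be tuned on held-out data, and other fusion schemes (for example, a small classifier over concatenated scores) are equally plausible.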

