Whispering in Amharic: Fine-tuning Whisper for Low-resource Language

Dawit Ketema Gete, Bedru Yimam Ahmed, Tadesse Destaw Belay, Yohannes Ayana Ejigu, Sukairaj Hafiz Imam, Alemu Belay Tessema, Mohammed Oumer Adem, Tadesse Amare Belay, Robert Geislinger, Umma Aliyu Musa, Martin Semmann, Shamsuddeen Hassan Muhammad, Henning Schreiber, Seid Muhie Yimam

This work explores fine-tuning OpenAI's Whisper automatic speech recognition (ASR) model for Amharic, a low-resource language, to improve transcription accuracy. The foundational Whisper model struggles with Amharic because the language is poorly represented in its training data, so we fine-tune it on datasets such as Mozilla Common Voice, FLEURS, and the BDU-speech dataset. The best-performing model, Whispersmall-am, improves significantly when fine-tuned on a mix of existing FLEURS data and new, unseen Amharic datasets. Training solely on the new data leads to poor performance, whereas combining it with FLEURS data reinforces the model and enables better specialization in Amharic. We also demonstrate that normalizing Amharic homophones significantly improves Word Error Rate (WER) and Bilingual Evaluation Understudy (BLEU) scores. This study underscores the importance of fine-tuning strategies and dataset composition for improving ASR in low-resource languages, and it provides insights for future Amharic speech recognition research.
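The effect of homophone normalization on WER can be illustrated with a small sketch. The abstract does not say which characters the authors merge, so the mapping below is an assumption based on commonly cited Amharic homophone series (several fidel series that are pronounced alike); `HOMOPHONE_SERIES`, `normalize_homophones`, and `word_error_rate` are hypothetical names introduced here, not the paper's code.

```python
# Assumed homophone series, given as (variant base, canonical base) code points.
# Each series occupies seven consecutive code points, one per vowel order.
# This mapping is an illustrative assumption, not the paper's exact table.
HOMOPHONE_SERIES = [
    (0x1210, 0x1200),  # ሐ-series -> ሀ-series
    (0x1280, 0x1200),  # ኀ-series -> ሀ-series
    (0x1220, 0x1230),  # ሠ-series -> ሰ-series
    (0x12D0, 0x12A0),  # ዐ-series -> አ-series
    (0x1340, 0x1338),  # ፀ-series -> ጸ-series
]

# Translation table for str.translate: variant code point -> canonical code point.
HOMOPHONE_TABLE = {
    variant + order: canonical + order
    for variant, canonical in HOMOPHONE_SERIES
    for order in range(7)
}


def normalize_homophones(text: str) -> str:
    """Replace each variant homophone character with its canonical form."""
    return text.translate(HOMOPHONE_TABLE)


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))  # distances against an empty reference
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1] / len(ref)


# Two spellings of the same two words that differ only in homophone characters.
reference = "\u1230\u120b\u121d \u1340\u1210\u12ed"
hypothesis = "\u1220\u120b\u121d \u1338\u1200\u12ed"

raw_wer = word_error_rate(reference, hypothesis)
normalized_wer = word_error_rate(
    normalize_homophones(reference), normalize_homophones(hypothesis)
)
print(raw_wer, normalized_wer)
```

In this toy case both words count as errors before normalization and none after, because the two transcripts differ only in which homophone character was chosen. Applying the same normalization to both reference and hypothesis before scoring is the design choice that keeps the comparison fair.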

