Structured prediction tasks, such as machine translation, involve learning functions that map structured inputs to structured outputs. Recurrent Neural Networks (RNNs) have historically been a popular choice for such tasks, including in natural language processing (NLP) applications. However, training RNNs with Maximum Likelihood Estimation (MLE) has known limitations, including exposure bias and a mismatch between the training objective and the metrics used at test time. SEARNN, based on the learning to search (L2S) framework, has been proposed as an alternative to MLE for RNN training. This project explored the potential of SEARNN to improve machine translation for low-resource African languages, a challenging task because training data is scarce and the languages are morphologically complex. Through translation experiments in the English-to-Igbo, French-to-\ewe, and French-to-\ghomala directions, this project evaluated the efficacy of SEARNN relative to MLE in addressing the challenges posed by these languages. With an average BLEU score improvement of $5.4$\% over the MLE objective, our results indicate that SEARNN is a viable algorithm for training RNNs on machine translation for low-resource languages.