Decomposed Prompting for Machine Translation Between Related Languages using Large Language Models

This study investigates machine translation between related languages, i.e., languages within the same family that share linguistic traits such as word order and lexical similarity. Machine translation through few-shot prompting uses a small set of example translation pairs to generate translations for test sentences. This requires the model to learn how to generate translations while also keeping the token ordering correct, so that the output is fluent and accurate. We propose that for related languages, the task of machine translation can be simplified by exploiting the monotonic alignment characteristic of such languages. We introduce a novel few-shot prompting approach that decomposes the translation process into a sequence of word chunk translations. Through evaluations on multiple related language pairs across various language families, we demonstrate that decomposed prompting surpasses multiple established few-shot baseline models, thereby verifying its effectiveness. For example, our model outperforms the strong few-shot prompted BLOOM model by an average of 4.2 chrF++ points across the examined languages.
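The idea of translating chunk by chunk and relying on monotonic alignment can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how chunks are formed or how prompts are laid out, so the fixed-size word chunking, the `Source:`/`Target:` prompt format, and every function name below are assumptions made for illustration. The `complete` argument stands in for any text-completion model, and `toy_complete` is a stub, not a real translation system.

```python
from typing import Callable, List, Tuple


def split_into_chunks(sentence: str, chunk_size: int = 3) -> List[str]:
    """Split a sentence into consecutive word chunks (assumed fixed size)."""
    words = sentence.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


def build_prompt(examples: List[Tuple[str, str]], chunk: str) -> str:
    """Build a few-shot prompt: example chunk pairs, then the chunk to translate."""
    blocks = [f"Source: {src}\nTarget: {tgt}" for src, tgt in examples]
    blocks.append(f"Source: {chunk}\nTarget:")
    return "\n\n".join(blocks)


def translate_decomposed(
    sentence: str,
    examples: List[Tuple[str, str]],
    complete: Callable[[str], str],
    chunk_size: int = 3,
) -> str:
    """Translate each chunk separately and join the outputs in source order.

    Joining in source order is only reasonable under the monotonic-alignment
    assumption that the abstract makes for related languages.
    """
    outputs = []
    for chunk in split_into_chunks(sentence, chunk_size):
        prompt = build_prompt(examples, chunk)
        outputs.append(complete(prompt).strip())
    return " ".join(outputs)


def toy_complete(prompt: str) -> str:
    """Hypothetical stand-in for a language model: uppercases the last source chunk."""
    last_source = prompt.rsplit("Source: ", 1)[1].split("\n")[0]
    return last_source.upper()


if __name__ == "__main__":
    print(translate_decomposed("a b c d e", [("x y", "X Y")], toy_complete))
```

In practice `complete` would wrap a call to a real model, and the few-shot examples would be chunk-level translation pairs in the language pair of interest. Because each chunk is translated on its own here, this sketch ignores context across chunk boundaries.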

