Large language models (LLMs) have shown surprisingly good performance in multilingual neural machine translation (MNMT) even when trained without parallel data. Yet, despite their enormous training data, they still struggle to translate rare words, particularly into low-resource languages. Even worse, it is usually unrealistic to retrieve relevant demonstrations for in-context learning in low-resource languages, which restricts the practical use of LLMs for translation -- how should we mitigate this problem? To this end, we present CoD, a novel method that augments LLMs with prior knowledge from chains of multilingual dictionaries for a subset of input words, eliciting the translation abilities of LLMs. Extensive experiments indicate that augmenting ChatGPT with CoD yields large gains of up to 13x in chrF++ points for MNMT (from 3.08 to 42.63 for English into Serbian written in Cyrillic script) on the full FLORES-200 devtest set. We further demonstrate the importance of chaining the multilingual dictionaries, as well as the superiority of CoD over few-shot demonstrations for low-resource languages.
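To make the method concrete, the sketch below shows one way such a chained-dictionary prompt could be assembled. It is a minimal illustration only: the function name `build_cod_prompt`, the `chained_dict` structure, and the exact prompt wording are assumptions for exposition, not the paper's actual template.

```python
# Minimal sketch of Chain-of-Dictionary (CoD) style prompting.
# Assumption: a chained dictionary mapping each selected (rare) source word
# to its translations in a sequence of languages, ending with the target.

from typing import Dict, List


def build_cod_prompt(
    source_sentence: str,
    source_lang: str,
    target_lang: str,
    chained_dict: Dict[str, Dict[str, str]],
) -> str:
    """Prepend chained multilingual dictionary hints for a subset of
    input words, then ask the model for the translation."""
    hints: List[str] = []
    for word, chain in chained_dict.items():
        # Each hint links one source word through several languages, e.g.
        # "orchard" means "verger" means "Obstgarten" means "воћњак".
        linked = " means ".join([f'"{word}"'] + [f'"{w}"' for w in chain.values()])
        hints.append(linked + ".")
    hint_block = "\n".join(hints)
    return (
        f"{hint_block}\n"
        f"Translate the following text from {source_lang} into {target_lang}:\n"
        f"{source_sentence}"
    )


# Example usage with an illustrative two-word chained dictionary.
prompt = build_cod_prompt(
    "The orchard was full of ripe apples.",
    "English",
    "Serbian (Cyrillic script)",
    {
        "orchard": {"French": "verger", "German": "Obstgarten", "Serbian": "воћњак"},
        "ripe": {"French": "mûr", "German": "reif", "Serbian": "зрео"},
    },
)
print(prompt)
```

The resulting prompt would then be passed to the LLM as-is; the chained entries supply the prior lexical knowledge that the model would otherwise lack for rare words in low-resource languages.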