Large language models (LLMs) demonstrate remarkable machine translation (MT) abilities via prompting, even though they were not explicitly trained for this task. However, despite the vast quantities of data they are trained on, LLMs can struggle to translate inputs containing rare words, which are common in low-resource and domain-transfer scenarios. We show that LLM prompting can also provide an effective solution for rare words, by using prior knowledge from bilingual dictionaries to supply control hints in the prompts. We propose a novel method, DiPMT, that provides a set of possible translations for a subset of the input words, thereby enabling fine-grained phrase-level prompted control of the LLM. Extensive experiments show that DiPMT outperforms the baseline in both low-resource and out-of-domain MT. We further provide a qualitative analysis of the benefits and limitations of this approach, including the overall level of controllability that is achieved.
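To make the idea of dictionary-based control hints concrete, the following is a minimal Python sketch of how such a prompt could be assembled. It is an illustration under stated assumptions, not the DiPMT implementation: the function name `build_hinted_prompt`, the prompt wording, the `max_candidates` cap, the whitespace tokenization, and the toy dictionary entry are all hypothetical, since the abstract does not specify the prompt format.

```python
from typing import Dict, List


def build_hinted_prompt(
    source: str,
    dictionary: Dict[str, List[str]],
    src_lang: str = "German",
    tgt_lang: str = "English",
    max_candidates: int = 3,
) -> str:
    """Build a translation prompt with dictionary hints for covered words.

    The template below is a hypothetical one chosen for illustration;
    it is not taken from the paper.
    """
    hints = []
    seen = set()
    for token in source.split():
        # Naive normalization: strip surrounding punctuation and lowercase.
        word = token.strip(".,;:!?\"'()").lower()
        if word in seen or word not in dictionary:
            continue
        seen.add(word)
        candidates = dictionary[word][:max_candidates]
        if not candidates:
            continue
        quoted = ", ".join(f'"{c}"' for c in candidates)
        hints.append(f'the word "{word}" may mean {quoted}')

    lines = [
        f"Translate the following sentence from {src_lang} to {tgt_lang}.",
        f"{src_lang}: {source}",
    ]
    # Only the subset of input words found in the dictionary receives hints.
    if hints:
        lines.append("Hints: " + "; ".join(hints) + ".")
    lines.append(f"{tgt_lang}:")
    return "\n".join(lines)


# Toy bilingual dictionary (assumed for illustration only).
toy_dictionary = {
    "zollabfertigung": ["customs clearance", "customs processing"],
}
prompt = build_hinted_prompt("Die Zollabfertigung dauert lange.", toy_dictionary)
print(prompt)
```

The resulting string would be passed to an LLM as the prompt. Because hints are attached only to words that appear in the dictionary, common words are left entirely to the model, and offering several candidates per word leaves the final lexical choice to the LLM rather than forcing a single translation.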