Resolving semantic ambiguity has long been recognised as a central challenge in machine translation. Recent work benchmarking translation performance on ambiguous sentences has exposed the limitations of conventional Neural Machine Translation (NMT) systems, which fail to capture many of these cases. Large language models (LLMs) have emerged as a promising alternative, demonstrating performance comparable to traditional NMT models while introducing new paradigms for controlling the target outputs. In this paper, we study the ability of LLMs to translate ambiguous sentences containing polysemous words and rare word senses. We also propose two ways to improve the handling of such ambiguity: in-context learning and fine-tuning on carefully curated ambiguous datasets. Experiments show that our methods match or outperform state-of-the-art systems such as DeepL and NLLB in four out of five language directions. Our research provides insights into effectively adapting LLMs for disambiguation during machine translation.