Large language models (LLMs) are increasingly strong contenders in machine translation. We study document-level translation, where some words cannot be translated without context from outside the sentence. We investigate the ability of prominent LLMs to exploit context by analyzing their robustness to perturbed and randomized document context. We find that LLMs' improved document-level translation performance is not always reflected in pronoun translation performance. We highlight the need for context-aware finetuning of LLMs, with a focus on the relevant parts of the context, to improve their reliability for document-level translation.
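To make the robustness probe concrete, the following is a minimal, self-contained sketch of a context-perturbation check. The prompt wording, the English-to-German example, and the helper name build_prompt are illustrative assumptions, not the paper's actual protocol: the idea is simply to compare a model's translation of the same sentence under intact versus shuffled document context.

```python
import random

def build_prompt(context_sents, source_sent, perturb=False, seed=0):
    """Assemble a document-level translation prompt.

    With perturb=True the preceding context is shuffled; comparing the
    model's output on the intact and perturbed prompts probes whether
    it actually uses the context (e.g., for pronoun resolution).
    """
    ctx = list(context_sents)
    if perturb:
        random.Random(seed).shuffle(ctx)  # randomized document context
    context_block = "\n".join(ctx)
    return (
        "Translate the final sentence into German, using the preceding "
        "context to resolve ambiguous words such as pronouns.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Sentence: {source_sent}\nTranslation:"
    )

# Toy example: translating "It" correctly requires the context, since
# German "die Lampe" is feminine and the pronoun must become "sie".
context = [
    "Anna bought a new lamp yesterday.",
    "She placed it on the desk in her office.",
    "The old one never gave enough light for reading.",
]
source = "It is much brighter than the old one."

print(build_prompt(context, source, perturb=False))
print("---")
print(build_prompt(context, source, perturb=True, seed=42))
```

Under this setup, a model that genuinely uses document context should translate the pronoun correctly with the intact context and degrade (or become inconsistent) when the context is shuffled or replaced with unrelated sentences.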