Large Language Models (LLMs) have achieved impressive results in Machine Translation (MT). However, careful human evaluations reveal that the translations produced by LLMs still contain multiple errors. Importantly, feeding such error information back into the LLMs can lead to self-refinement and result in improved translation performance. Motivated by these insights, we introduce a systematic LLM-based self-refinement translation framework, named \textbf{TEaR}, which stands for \textbf{T}ranslate, \textbf{E}stimate, \textbf{a}nd \textbf{R}efine, marking a significant step forward in this direction. Our findings demonstrate that: 1) our self-refinement framework successfully assists LLMs in improving their translation quality across a wide range of languages, whether translating from high-resource languages into low-resource ones, and whether the language pairs are English-centric or centered around other languages; 2) TEaR exhibits superior systematicity and interpretability; 3) different estimation strategies yield varied impacts, directly affecting the effectiveness of the final corrections. Additionally, traditional neural translation models and evaluation models operate separately, each focusing on a single task due to their limited capabilities, whereas general-purpose LLMs possess the capability to undertake both tasks simultaneously. We further conduct cross-model correction experiments to investigate the potential relationship between the translation and evaluation capabilities of general-purpose LLMs. Our code and data are available at https://github.com/fzp0424/self_correct_mt
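The three-stage loop the abstract names (Translate, Estimate, Refine) can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the `call_llm` callable and the prompt wording are assumptions introduced here for clarity, standing in for whatever model API and prompt templates the framework uses.

```python
def tear(source: str, target_lang: str, call_llm) -> str:
    """One round of the Translate-Estimate-Refine loop (illustrative sketch).

    `call_llm` is a hypothetical function mapping a prompt string to the
    model's text response; prompt phrasing here is a placeholder.
    """
    # Translate: produce an initial draft translation.
    draft = call_llm(f"Translate into {target_lang}: {source}")

    # Estimate: ask the model to assess its own draft and report errors.
    feedback = call_llm(
        f"List any translation errors.\nSource: {source}\nTranslation: {draft}"
    )

    # Refine: feed the error feedback back in to produce a corrected translation.
    refined = call_llm(
        f"Improve the translation using the feedback.\n"
        f"Source: {source}\nTranslation: {draft}\nFeedback: {feedback}"
    )
    return refined


# Usage with a deterministic mock model, just to show the control flow:
def mock_llm(prompt: str) -> str:
    if prompt.startswith("Translate"):
        return "draft translation"
    if prompt.startswith("List"):
        return "mistranslated term X"
    return "refined translation"


result = tear("source sentence", "English", mock_llm)
```

The key idea the abstract highlights is the Estimate step: the quality of the error estimation directly shapes how effective the final Refine step can be.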