MTUncertainty: Assessing the Need for Post-editing of Machine Translation Outputs by Fine-tuning OpenAI LLMs

Translation Quality Evaluation (TQE) is an essential step of the modern translation production process. TQE is critical in assessing both machine translation (MT) and human translation (HT) quality without reference translations. The ability to evaluate or even simply estimate the quality of translation automatically may open significant efficiency gains through process optimisation. This work examines whether the state-of-the-art large language models (LLMs) can be used for this purpose. We take OpenAI models as the best state-of-the-art technology and approach TQE as a binary classification task. On eight language pairs including English to Italian, German, French, Japanese, Dutch, Portuguese, Turkish, and Chinese, our experimental results show that fine-tuned gpt3.5 can demonstrate good performance on translation quality prediction tasks, i.e. whether the translation needs to be edited. Another finding is that simply increasing the sizes of LLMs does not lead to apparent better performances on this task by comparing the performance of three different versions of OpenAI models: curie, davinci, and gpt3.5 with 13B, 175B, and 175B parameters, respectively.

翻译：翻译质量评估（TQE）是现代翻译生产流程中的关键环节。TQE对于在无参考译文的情况下评估机器翻译（MT）和人工翻译（HT）的质量至关重要。自动评估乃至简单估算翻译质量的能力，可通过流程优化带来显著的效率提升。本研究探讨了当前最先进的大语言模型（LLMs）能否用于此目的。我们将OpenAI模型视为当前最佳技术，并将TQE视为二元分类任务。在包括英语到意大利语、德语、法语、日语、荷兰语、葡萄牙语、土耳其语和中文的八个语言对上，实验结果表明，经过微调的gpt3.5在翻译质量预测任务（即判断译文是否需要编辑）上能表现出良好性能。另一项发现是，通过比较参数规模分别为130亿、1750亿和1750亿的三种不同版本OpenAI模型（curie、davinci和gpt3.5）的性能，单纯增加LLM的规模并未在此任务上带来明显更好的表现。

相关内容

Machine Translation

关注 210

机器翻译（Machine Translation）涵盖计算语言学和语言工程的所有分支，包含多语言方面。特色论文涵盖理论，描述或计算方面的任何下列主题:双语和多语语料库的编写和使用，计算机辅助语言教学，非罗马字符集的计算含义，连接主义翻译方法，对比语言学等。官网地址：http://dblp.uni-trier.de/db/journals/mt/

《生成式模型: 变分自编码器与扩散模型》，75页ppt，Google DeepMind科学家Ruiqi Gao

专知会员服务

66+阅读 · 2023年6月10日

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日