Machine Translation Quality Estimation (MTQE, hereafter QE) is the task of estimating the quality of machine-translated text in real time without reference translations, and it plays an important role in the development of MT. After two decades of evolution, QE research has produced a substantial body of results. This article provides a comprehensive overview of QE datasets, annotation methods, shared tasks, methodologies, challenges, and future research directions. It begins by introducing the background and significance of QE, then explains the concepts and evaluation metrics for word-level, sentence-level, document-level, and explainable QE. It groups the methods developed over the history of QE into three categories: those based on handcrafted features, deep learning, and Large Language Models (LLMs). Deep learning-based methods are further divided into classic deep learning and those incorporating pre-trained language models (LMs). The article also details the advantages and limitations of each method and offers a straightforward comparison of the different approaches. Finally, it discusses the current challenges in QE research and provides an outlook on future research directions.