Existing neural machine translation (NMT) studies mainly focus on developing dataset-specific models, each built on data from a particular task (e.g., document translation or chat translation). Although these dataset-specific models have achieved impressive performance, the approach is cumbersome, as each dataset requires its own model to be designed, trained, and stored. In this work, we aim to unify these translation tasks into a more general setting. Specifically, we propose a ``versatile'' model, the Unified Model Learning for NMT (UMLNMT), which works with data from different tasks and can translate well in multiple settings simultaneously; in principle, the number of settings is not limited. Through unified learning, UMLNMT is trained jointly across multiple tasks, enabling intelligent on-demand translation. On seven widely used translation tasks, including sentence translation, document translation, and chat translation, UMLNMT yields substantial improvements over dataset-specific models while significantly reducing model deployment costs. Furthermore, UMLNMT achieves competitive or better performance than state-of-the-art dataset-specific methods. Human evaluation and in-depth analysis also demonstrate the superiority of our approach in generating diverse and high-quality translations. Additionally, we provide a new genre translation dataset of famous aphorisms with 186k Chinese->English sentence pairs.