Unified Model Learning for Various Neural Machine Translation

Existing neural machine translation (NMT) studies mainly focus on developing dataset-specific models for data from different tasks (e.g., document translation and chat translation). Although dataset-specific models have achieved impressive performance, the approach is cumbersome, as each dataset requires its own model to be designed, trained, and stored. In this work, we aim to unify these translation tasks into a more general setting. Specifically, we propose a "versatile" model, the Unified Model Learning for NMT (UMLNMT), which works with data from different tasks and can translate well in multiple settings simultaneously; in theory, it can cover as many settings as needed. Through unified learning, UMLNMT can be trained jointly across multiple tasks, enabling intelligent on-demand translation. On seven widely used translation tasks, including sentence translation, document translation, and chat translation, UMLNMT yields substantial improvements over dataset-specific models while significantly reducing model deployment costs. Furthermore, UMLNMT achieves competitive or better performance than state-of-the-art dataset-specific methods. Human evaluation and in-depth analysis also demonstrate the superiority of our approach in generating diverse and high-quality translations. Additionally, we provide a new genre translation dataset of famous aphorisms with 186k Chinese-to-English sentence pairs.

