MTCue: Learning Zero-Shot Control of Extra-Textual Attributes by Leveraging Unstructured Context in Neural Machine Translation

Efficient utilisation of both intra- and extra-textual context remains one of the critical gaps between machine and human translation. Existing research has primarily focused on providing individual, well-defined types of context in translation, such as the surrounding text or discrete external variables like the speaker's gender. This work introduces MTCue, a novel neural machine translation (NMT) framework that interprets all context (including discrete variables) as text. MTCue learns an abstract representation of context, enabling transferability across different data settings and leveraging similar attributes in low-resource scenarios. With a focus on a dialogue domain with access to document and metadata context, we extensively evaluate MTCue on four language pairs in both translation directions. Our framework demonstrates significant improvements in translation quality over a parameter-matched non-contextual baseline, as measured by BLEU (+0.88) and COMET (+1.58). Moreover, MTCue significantly outperforms a "tagging" baseline at translating English text. Analysis reveals that the context encoder of MTCue learns a representation space that organises context based on specific attributes, such as formality, enabling effective zero-shot control. Pre-training on context embeddings also improves MTCue's few-shot performance compared to the "tagging" baseline. Finally, an ablation study conducted on model components and contextual variables further supports the robustness of MTCue for context-based NMT.
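To make the idea of "interpreting all context as text" concrete, the following is a minimal sketch and not the authors' implementation. It contrasts rendering discrete variables as plain-text cues with a "tagging"-style input that prepends special tokens to the source sentence. The function names, the `key: value` cue format, and the `<key=value>` tag format are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import Dict, List


def context_as_text(metadata: Dict[str, str], previous_lines: List[str]) -> List[str]:
    """Turn every piece of context into a plain-text string (hypothetical format)."""
    # Discrete variables become short text strings rather than category IDs.
    cues = [f"{key}: {value}" for key, value in metadata.items()]
    # Document context (preceding dialogue lines) is already text; keep it as is.
    cues.extend(previous_lines)
    return cues


def tag_source(source: str, metadata: Dict[str, str]) -> str:
    """A 'tagging'-style alternative: prepend one special token per attribute."""
    # The tag syntax here is an assumption, not taken from the paper.
    tags = " ".join(f"<{key}={value}>" for key, value in metadata.items())
    return f"{tags} {source}".strip()


# Hypothetical example inputs.
metadata = {"speaker gender": "female", "formality": "informal"}
previous_lines = ["Where were you last night?"]

cues = context_as_text(metadata, previous_lines)
tagged = tag_source("I was at home.", metadata)

print(cues)
print(tagged)
```

In the text-based form, an attribute value never seen during training is still an ordinary string that a context encoder could embed, which is one plausible reading of why the abstract reports zero-shot control; a fixed tag vocabulary has no such fallback.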

