Efficient utilisation of both intra- and extra-textual context remains one of the critical gaps between machine and human translation. Existing research has primarily focused on providing individual, well-defined types of context in translation, such as the surrounding text or discrete external variables like the speaker's gender. This work introduces MTCue, a novel neural machine translation (NMT) framework that interprets all context (including discrete variables) as text. MTCue learns an abstract representation of context, enabling transferability across different data settings and leveraging similar attributes in low-resource scenarios. With a focus on a dialogue domain with access to document and metadata context, we extensively evaluate MTCue in four language pairs in both translation directions. Our framework demonstrates significant improvements in translation quality over a parameter-matched non-contextual baseline, as measured by BLEU (+0.88) and Comet (+1.58). Moreover, MTCue significantly outperforms a "tagging" baseline at translating English text. Analysis reveals that the context encoder of MTCue learns a representation space that organises context based on specific attributes, such as formality, enabling effective zero-shot control. Pre-training on context embeddings also improves MTCue's few-shot performance compared to the "tagging" baseline. Finally, an ablation study conducted on model components and contextual variables further supports the robustness of MTCue for context-based NMT.
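To make the idea of interpreting all context as text concrete, the following minimal sketch renders discrete variables (such as the speaker's gender or formality) as plain strings alongside document context, and contrasts this with a "tagging"-style input. The function names, the `key: value` template, and the `<key=value>` tag format are illustrative assumptions; they are not taken from the MTCue implementation.

```python
from typing import Dict, List


def context_as_text(metadata: Dict[str, object], previous_lines: List[str]) -> List[str]:
    """Render every context item, discrete or not, as a plain text string.

    Hypothetical helper: the "key: value" template is an assumption.
    """
    items = [f"{key}: {value}" for key, value in metadata.items()]
    # Document context (e.g. preceding dialogue lines) is already text.
    items.extend(previous_lines)
    return items


def tagged_source(metadata: Dict[str, object], source: str) -> str:
    """A "tagging"-style alternative: discrete values become special tokens
    prepended to the source sentence. The tag format is an assumption."""
    tags = " ".join(f"<{key}={value}>" for key, value in metadata.items())
    return f"{tags} {source}".strip()


if __name__ == "__main__":
    meta = {"speaker gender": "female", "formality": "formal"}
    print(context_as_text(meta, ["How are you?"]))
    print(tagged_source(meta, "I am fine."))
```

In a setup like this, the text strings could be passed to any sentence encoder, so an unseen attribute value still receives a meaningful representation, whereas a tag token would have to be learned from scratch.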