Despite remarkable advances in machine translation, the prevailing sentence-level paradigm struggles with highly contextual languages such as Japanese. In this paper, we explore how context-awareness can improve current Neural Machine Translation (NMT) models for English-Japanese business dialogue translation, and which kinds of context provide meaningful information for translation. Because business dialogue involves complex discourse phenomena but offers scarce training resources, we adapt a pretrained mBART model by finetuning it on multi-sentence dialogue data, which allows us to experiment with different contexts. We investigate the impact of larger context sizes and propose novel context tokens that encode extra-sentential information, such as speaker turn and scene type. We use Conditional Cross-Mutual Information (CXMI) to measure how much of the context the model uses, and we generalise CXMI to study the impact of the extra-sentential context. Overall, we find that models leverage both preceding sentences and extra-sentential context (with CXMI increasing with context size), and we provide a more focused analysis of honorifics translation. Regarding translation quality, increased source-side context paired with scene and speaker information improves model performance over previous work and over our context-agnostic baselines, as measured by BLEU and COMET.
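To make the two technical ingredients of the abstract concrete, the following is a minimal sketch, not the implementation used in the paper. It assumes that context tokens are prepended to the source as plain strings and that CXMI is estimated as the average gain in reference log-likelihood when the model is given context. The token spellings (`<scene:...>`, `<spk:...>`, `<sep>`) and the function names `build_source` and `cxmi` are hypothetical and chosen only for illustration; the log-probabilities are assumed to come from scoring the same reference translations with a context-agnostic and a context-aware model.

```python
from typing import Optional, Sequence


def build_source(
    current: str,
    previous: Sequence[str] = (),
    scene: Optional[str] = None,
    speaker_changed: Optional[bool] = None,
) -> str:
    """Prepend hypothetical context tokens and preceding sentences to a source sentence."""
    parts = []
    if scene is not None:
        # Extra-sentential context: scene type (token spelling is an assumption).
        parts.append(f"<scene:{scene}>")
    if speaker_changed is not None:
        # Extra-sentential context: speaker turn (token spelling is an assumption).
        parts.append("<spk:change>" if speaker_changed else "<spk:same>")
    for sentence in previous:
        # Preceding sentences, each followed by a separator token.
        parts.extend([sentence, "<sep>"])
    parts.append(current)
    return " ".join(parts)


def cxmi(
    logp_without_context: Sequence[float],
    logp_with_context: Sequence[float],
) -> float:
    """Estimate CXMI as the mean per-example gain in reference log-probability.

    Each entry is log q(y | x) or log q(y | x, C) for the same reference y.
    A positive value suggests the model makes use of the context.
    """
    if len(logp_without_context) != len(logp_with_context):
        raise ValueError("both score lists must cover the same examples")
    if not logp_without_context:
        raise ValueError("at least one example is required")
    gains = [c - a for a, c in zip(logp_without_context, logp_with_context)]
    return sum(gains) / len(gains)
```

Under these assumptions, the same `cxmi` function covers the generalised case: the "with context" scores can come from a model that sees only the scene and speaker tokens, only the preceding sentences, or both, and the resulting values can be compared across context sizes.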