Audience-specific Explanations for Machine Translation

In machine translation, a common problem is that certain words, even when translated, remain incomprehensible to the target-language audience because of differences in cultural background. One solution is to add explanations for these words. As a first step, we therefore need to identify these words or phrases. In this work, we explore techniques for extracting example explanations from a parallel corpus. However, sentences containing words that need to be explained are sparse, which makes building a training dataset extremely difficult. We propose a semi-automatic technique to extract these explanations from a large parallel corpus. Experiments on the English->German language pair show that our method extracts a set of sentences of which more than 10% contain explanations, whereas only 1.9% of the original sentences do. Experiments on the English->French and English->Chinese language pairs lead to similar conclusions, showing that the technique is robust across all three language pairs. This is therefore an essential first automatic step toward creating an explanation dataset.
