Machine translation (MT) requires a wide range of linguistic capabilities, which current end-to-end models are expected to learn implicitly by observing aligned sentences in bilingual corpora. In this work, we ask: \emph{How well do MT models learn coreference resolution from implicit signal?} To answer this question, we develop an evaluation methodology that derives coreference clusters from MT output and evaluates them without requiring annotations in the target language. We further evaluate several prominent open-source and commercial MT systems, translating from English to six target languages, and compare them to state-of-the-art coreference resolvers on three challenging benchmarks. Our results show that the monolingual resolvers greatly outperform MT models. Motivated by this result, we experiment with different methods for incorporating the output of coreference resolution models in MT, showing improvement over strong baselines.