Many common character-level, string-to-string transduction tasks, e.g., grapheme-to-phoneme conversion and morphological inflection, consist almost exclusively of monotonic transductions. However, neural sequence-to-sequence models that use non-monotonic soft attention often outperform popular monotonic models. In this work, we ask the following question: Is monotonicity really a helpful inductive bias for these tasks? We develop a hard attention sequence-to-sequence model that enforces strict monotonicity and learns a latent alignment jointly while learning to transduce. With the help of dynamic programming, we are able to compute the exact marginalization over all monotonic alignments. Our models achieve state-of-the-art performance on morphological inflection. Furthermore, we find strong performance on two other character-level transduction tasks. Code is available at https://github.com/shijie-wu/neural-transducer.
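To make the dynamic-programming claim concrete, here is a minimal sketch of a forward recursion that exactly marginalizes over all monotonic alignments. It assumes a simplified first-order factorization: per-position emission probabilities `emit[j, i]` and monotone transition probabilities `trans[k, i]` (zero for i < k). The names `monotonic_marginal`, `emit`, and `trans` are hypothetical; in the actual model these quantities are parameterized by an encoder-decoder network rather than given as fixed matrices.

```python
import numpy as np

def monotonic_marginal(emit, trans):
    """Exact marginal likelihood over all monotonic alignments.

    emit[j, i]  = p(y_j | source position i), shape (n, m)
    trans[k, i] = p(align next symbol to i | previous position k),
                  with trans[k, i] = 0 for i < k to enforce monotonicity.
    Returns p(y_1..y_n) summed over every monotone alignment.
    """
    n, m = emit.shape
    alpha = np.zeros((n, m))
    # Uniform prior over the first aligned position (an assumption here).
    alpha[0] = emit[0] / m
    for j in range(1, n):
        # alpha[j, i] = emit[j, i] * sum_{k <= i} alpha[j-1, k] * trans[k, i]
        alpha[j] = emit[j] * (alpha[j - 1] @ trans)
    return alpha[-1].sum()

# Toy example: n = 2 target symbols, m = 3 source positions.
rng = np.random.default_rng(0)
emit = rng.random((2, 3))
trans = np.triu(rng.random((3, 3)))          # upper triangular: monotone moves only
trans /= trans.sum(axis=1, keepdims=True)    # normalize each row to a distribution
print(monotonic_marginal(emit, trans))
```

This recursion visits each (target position, source position) pair once, so the marginalization runs in polynomial time even though the number of monotonic alignments is exponential. A practical implementation would work in log space for numerical stability.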