This paper studies the ability of recurrent neural network sequence-to-sequence (RNN seq2seq) models to learn four string-to-string transduction tasks: identity, reversal, total reduplication, and input-specified reduplication. These transductions are traditionally well studied in terms of finite-state transducers and are attributed varying degrees of complexity. We find that RNN seq2seq models are only able to approximate a mapping that fits the training or in-distribution data. Attention helps significantly but does not solve the out-of-distribution generalization limitation. Task complexity and the choice of RNN variant also play a role in the results. Our results are best understood in terms of the complexity hierarchy of formal languages rather than that of string transductions.
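For readers unfamiliar with the four tasks, the target functions can be sketched as plain string operations. This is only an illustrative sketch, not the paper's data-generation code: the function names are hypothetical, and the encoding of input-specified reduplication (a string followed by n copies of a marker symbol, here assumed to be "@", mapped to n + 1 copies of the string) is an assumption that the abstract itself does not spell out.

```python
def identity(w: str) -> str:
    """Identity: map w to w."""
    return w


def reversal(w: str) -> str:
    """Reversal: map w to its mirror image."""
    return w[::-1]


def total_reduplication(w: str) -> str:
    """Total reduplication: map w to ww."""
    return w + w


def input_specified_reduplication(x: str, marker: str = "@") -> str:
    """Input-specified reduplication under an ASSUMED encoding.

    The input is taken to be w followed by n copies of `marker`,
    and the output is w repeated n + 1 times. The marker symbol and
    this exact encoding are assumptions made for illustration; w itself
    is assumed not to end with the marker.
    """
    w = x.rstrip(marker)      # strip the trailing run of markers
    n = len(x) - len(w)       # number of markers that were stripped
    return w * (n + 1)


if __name__ == "__main__":
    for f, s in [
        (identity, "abc"),
        (reversal, "abc"),
        (total_reduplication, "abc"),
        (input_specified_reduplication, "abc@@"),
    ]:
        print(f.__name__, repr(s), "->", repr(f(s)))
```

Under this sketch, in-distribution versus out-of-distribution evaluation would amount to testing on input lengths seen or unseen during training; the functions themselves are defined for strings of any length.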