Previous work has established that RNNs with an unbounded activation function have the capacity to count exactly. However, it has also been shown that RNNs are difficult to train effectively and generally do not learn exact counting behaviour. In this paper, we address this problem by studying the simplest possible RNN, a linear single-cell network. We conduct a theoretical analysis of linear RNNs, identify conditions under which the models exhibit exact counting behaviour, and formally prove that these conditions are necessary and sufficient. We also conduct an empirical analysis using tasks involving a Dyck-1-like Balanced Bracket language under two different settings. We observe that linear RNNs trained with the standard approach generally do not meet the necessary and sufficient conditions for counting behaviour. We investigate how varying the length of training sequences and using different target classes affect model behaviour during training and the ability of linear RNN models to approximate the indicator conditions.
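To make the idea of exact counting in a linear single-cell network concrete, the following is a minimal sketch, not the paper's implementation. It assumes a scalar recurrence h_t = w_rec * h_{t-1} + w_in(x_t), with hypothetical weight names `w_rec`, `w_open`, and `w_close`. The values 1, +1, and -1 are one illustrative setting under which the hidden state tracks bracket depth. They are not a statement of the paper's necessary and sufficient conditions. The `is_balanced` readout is likewise an assumption added for illustration and sits outside the recurrent cell.

```python
def linear_rnn_count(sequence, w_rec=1.0, w_open=1.0, w_close=-1.0, h0=0.0):
    """Run a single-cell linear RNN over a bracket string.

    Recurrence (assumed for illustration): h_t = w_rec * h_{t-1} + w_in(x_t),
    where w_in is w_open for "(" and w_close for ")".
    Returns the list of hidden states, one per input token.
    """
    h = h0
    states = []
    for token in sequence:
        if token == "(":
            inp = w_open
        elif token == ")":
            inp = w_close
        else:
            raise ValueError(f"unexpected token: {token!r}")
        h = w_rec * h + inp  # linear update, no activation function
        states.append(h)
    return states


def is_balanced(sequence, **weights):
    """Hypothetical readout: balanced iff the state never goes negative
    and returns to exactly zero at the end of the sequence."""
    states = linear_rnn_count(sequence, **weights)
    never_negative = all(h >= 0 for h in states)
    final = states[-1] if states else 0.0
    return never_negative and final == 0.0


if __name__ == "__main__":
    print(linear_rnn_count("(())"))             # hidden state equals bracket depth
    print(is_balanced("(())()"))                # True
    print(is_balanced("())("))                  # False: depth dips below zero
    print(linear_rnn_count("(())", w_rec=0.9))  # state drifts, count is no longer exact
```

With the recurrent weight set away from 1 (for example 0.9), the state no longer returns to zero on a balanced string. This is the kind of deviation from exact counting that the abstract reports for models trained with the standard approach.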