Recent work by Hewitt et al. (2020) provides a possible interpretation of the empirical success of recurrent neural networks (RNNs) as language models (LMs). It shows that RNNs can efficiently represent bounded hierarchical structures that are prevalent in human language, suggesting that RNNs' success might be linked to their ability to model hierarchy. However, a closer inspection of Hewitt et al.'s (2020) construction shows that it is not limited to hierarchical LMs, which raises the question of what \emph{other classes} of LMs RNNs can efficiently represent. To this end, we generalize their construction and show that RNNs can efficiently represent a larger class of LMs: those representable by a pushdown automaton with a bounded stack and a generalized stack update function. Informally, such an automaton keeps a memory of a fixed number of symbols and updates it with a simple update mechanism. Altogether, the efficiency with which RNNs represent this diverse class of non-hierarchical LMs suggests that they lack a concrete cognitive, human-language-centered inductive bias.
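To make the memory model concrete, the following is a minimal sketch of the kind of automaton described above: a fixed-capacity memory of symbols driven by an arbitrary update function. The names (`BoundedMemoryAutomaton`, `push_or_pop`) and the string alphabet are illustrative assumptions, not part of the formal construction; a classical bounded stack is recovered as the special case where the update function only pushes or pops, while other choices of update function yield non-hierarchical memories.

```python
from typing import Callable, Tuple

# A fixed number of memory cells, each holding one symbol.
Memory = Tuple[str, ...]

class BoundedMemoryAutomaton:
    """Sketch of an automaton with a bounded symbol memory and a
    generalized update function (illustrative, not the paper's construction)."""

    def __init__(self, capacity: int,
                 update: Callable[[Memory, str], Memory]) -> None:
        self.capacity = capacity  # fixed bound on the memory size
        self.update = update      # generalized (stack) update function
        self.memory: Memory = ()  # start with an empty memory

    def step(self, symbol: str) -> Memory:
        new_memory = self.update(self.memory, symbol)
        assert len(new_memory) <= self.capacity, "memory bound exceeded"
        self.memory = new_memory
        return self.memory

# Special case: an update that only pushes or pops gives a bounded stack.
def push_or_pop(memory: Memory, symbol: str) -> Memory:
    if symbol == "POP":
        return memory[:-1]        # pop the top symbol
    return memory + (symbol,)     # push the new symbol

automaton = BoundedMemoryAutomaton(capacity=4, update=push_or_pop)
for sym in ["a", "b", "POP", "c"]:
    automaton.step(sym)
print(automaton.memory)  # ('a', 'c')
```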