The recent successes and spread of large neural language models (LMs) call for a thorough understanding of their computational ability. Characterizing this ability through LMs' \emph{representational capacity} is a lively area of research. However, investigations into the representational capacity of neural LMs have predominantly focused on their ability to \emph{recognize} formal languages. For example, recurrent neural networks (RNNs) with Heaviside activations are tightly linked to regular languages, i.e., the languages defined by finite-state automata (FSAs). Such results, however, fall short of describing the capabilities of RNN \emph{language models}, which are, by definition, \emph{distributions} over strings. We take a fresh look at the representational capacity of RNN LMs by connecting them to \emph{probabilistic} FSAs and demonstrate that RNN LMs with linearly bounded precision can express arbitrary regular LMs.
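To make the central object concrete, the following is a minimal illustrative sketch (not taken from the paper) of a probabilistic FSA as a \emph{distribution} over strings: transition and halting probabilities jointly define the probability of each string via a sum over accepting runs. The two-state example automaton below is a hypothetical toy, chosen only for illustration.

```python
# Illustrative sketch: a probabilistic finite-state automaton (PFSA)
# defines a distribution over strings. Each state distributes its
# probability mass between outgoing transitions and halting.
from collections import defaultdict


class PFSA:
    def __init__(self, init, trans, final):
        # init:  {state: initial probability}
        # trans: {state: [(symbol, next_state, probability), ...]}
        # final: {state: halting probability}
        self.init, self.trans, self.final = init, trans, final

    def prob(self, string):
        # Forward algorithm: sum the probabilities of all runs
        # that read `string` and then halt.
        alpha = dict(self.init)
        for sym in string:
            nxt = defaultdict(float)
            for q, p in alpha.items():
                for a, r, tp in self.trans.get(q, []):
                    if a == sym:
                        nxt[r] += p * tp
            alpha = nxt
        return sum(p * self.final.get(q, 0.0) for q, p in alpha.items())


# Toy one-state LM over {a}: emit 'a' with prob 0.5 or halt with prob 0.5,
# so P(a^n) = 0.5^(n+1) and the probabilities sum to 1 over all strings.
lm = PFSA(init={0: 1.0}, trans={0: [("a", 0, 0.5)]}, final={0: 0.5})
print(lm.prob(""))    # 0.5
print(lm.prob("a"))   # 0.25
print(lm.prob("aa"))  # 0.125
```

The paper's claim can then be read as: an RNN LM with linearly bounded precision can match the string distribution computed by any such automaton.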