The Surprising Computational Power of Nondeterministic Stack RNNs

Traditional recurrent neural networks (RNNs) have a fixed, finite number of memory cells. In theory (assuming bounded range and precision), this limits their formal language recognition power to regular languages, and in practice, RNNs have been shown to be unable to learn many context-free languages (CFLs). In order to expand the class of languages RNNs recognize, prior work has augmented RNNs with a nondeterministic stack data structure, putting them on par with pushdown automata and increasing their language recognition power to CFLs. Nondeterminism is needed for recognizing all CFLs (not just deterministic CFLs), but in this paper, we show that nondeterminism and the neural controller interact to produce two more unexpected abilities. First, the nondeterministic stack RNN can recognize not only CFLs, but also many non-context-free languages. Second, it can recognize languages with much larger alphabet sizes than one might expect given the size of its stack alphabet. Finally, to increase the information capacity in the stack and allow it to solve more complicated tasks with large alphabet sizes, we propose a new version of the nondeterministic stack that simulates stacks of vectors rather than discrete symbols. We demonstrate perplexity improvements with this new model on the Penn Treebank language modeling benchmark.
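
To make concrete why nondeterminism matters for recognizing all CFLs, the following is a minimal sketch of a nondeterministic pushdown automaton for the language {w wᴿ : w ∈ {0,1}*}, which no deterministic pushdown automaton recognizes because the midpoint has to be guessed. It is an illustration only, not the differentiable stack from the paper: the function name `accepts_w_wreversed` and the set-of-configurations simulation are assumptions chosen for clarity.

```python
def accepts_w_wreversed(s: str) -> bool:
    """Simulate a nondeterministic PDA for {w w^R} by tracking every live run.

    Hypothetical illustration; this is not the paper's model.
    """
    # A configuration is (phase, stack): phase 0 pushes the first half,
    # phase 1 pops while matching the second half.
    configs = {(0, ())}
    for ch in s:
        next_configs = set()
        for phase, stack in configs:
            if phase == 0:
                # Branch A: assume we are still in the first half, so push.
                next_configs.add((0, stack + (ch,)))
            # Branch B (from phase 0): guess the midpoint was just passed.
            # In phase 1 this is the only legal move. Either way, the
            # symbol read must match the top of the stack.
            if stack and stack[-1] == ch:
                next_configs.add((1, stack[:-1]))
        configs = next_configs
    # Accept if any run ends with an empty stack.
    return any(stack == () for _, stack in configs)
```

The simulation keeps all guesses alive at once instead of committing to one, which is the sense in which a nondeterministic stack can explore many stack contents in parallel.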

