Artificial neural networks (ANNs) with recurrence and self-attention have been shown to be Turing-complete (TC). However, existing work has shown that these ANNs require multiple turns or unbounded computation time, even with unbounded precision in weights, in order to recognize TC grammars. Moreover, under constraints such as fixed or bounded precision neurons and bounded time, ANNs without memory have been shown to struggle to recognize even context-free languages. In this work, we extend the theoretical foundation for the $2^{nd}$-order recurrent network ($2^{nd}$ RNN) and prove that there exists a class of $2^{nd}$ RNNs that is Turing-complete with bounded time. This model is capable of directly encoding a transition table into its recurrent weights, which enables bounded-time computation and makes the model interpretable by design. We also demonstrate that $2^{nd}$-order RNNs, without memory and under bounded weights and time constraints, outperform modern-day models such as vanilla RNNs and gated recurrent units in recognizing regular grammars. We provide an upper bound on the number of neurons required by $2^{nd}$-order RNNs to recognize any class of regular grammar, together with a stability analysis. Extensive experiments on the Tomita grammars support our findings, demonstrating the importance of tensor connections in crafting computationally efficient RNNs. Finally, we show that $2^{nd}$-order RNNs are also interpretable by extraction: state machines can be extracted from them with higher success rates than from first-order RNNs. Our results extend the theoretical foundations of RNNs and offer promising avenues for future explainable AI research.
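To make the idea of encoding a transition table into recurrent weights concrete, the following is a minimal sketch, not necessarily the construction used in this work. It assumes the standard second-order update $h_i^{t+1} = \sigma\left(\sum_{j,k} W_{ijk}\, h_j^{t}\, x_k^{t} + b_i\right)$ with one-hot states and inputs. The weight strength `H`, the helper names `encode_dfa`, `run`, and `accepts`, and the two-state automaton for the Tomita grammar $1^*$ are all illustrative assumptions.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def encode_dfa(delta, n_states, n_symbols, H=10.0):
    """Program second-order weights from a DFA transition table.

    delta[(j, k)] = i means: in state j, reading symbol k, move to state i.
    H is an assumed weight strength; larger H gives crisper state vectors.
    """
    # Every (target, source, symbol) triple starts as inhibitory ...
    W = np.full((n_states, n_states, n_symbols), -H)
    # ... and only the transitions listed in the table become excitatory.
    for (j, k), i in delta.items():
        W[i, j, k] = H
    b = np.full(n_states, -H / 2.0)
    return W, b


def run(W, b, string, start=0):
    """Apply h_i <- sigmoid(sum_jk W[i,j,k] * h[j] * x[k] + b[i]) per symbol."""
    n_states, _, n_symbols = W.shape
    h = np.zeros(n_states)
    h[start] = 1.0
    for ch in string:
        x = np.zeros(n_symbols)
        x[int(ch)] = 1.0  # one-hot input symbol
        h = sigmoid(np.einsum("ijk,j,k->i", W, h, x) + b)
    return h


def accepts(W, b, string, accepting, start=0):
    """Read off the most active state neuron and check whether it is accepting."""
    return int(np.argmax(run(W, b, string, start))) in accepting


# Hypothetical two-state DFA for the Tomita grammar 1* over the alphabet {0, 1}:
# state 0 = "only 1s seen so far" (accepting), state 1 = rejecting sink.
TOMITA1 = {(0, 1): 0, (0, 0): 1, (1, 0): 1, (1, 1): 1}
W, b = encode_dfa(TOMITA1, n_states=2, n_symbols=2)

for s in ["1111", "1101"]:
    print(s, accepts(W, b, s, accepting={0}))
```

In this sketch each entry of the weight tensor corresponds to exactly one (target state, source state, input symbol) triple, so the automaton can be read back from the signs of `W`. That one-to-one correspondence is the sense in which such a network is interpretable by design, and each input symbol costs a single update step.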