Canonical recurrent neural networks (RNNs) are known to have difficulty learning long-term dependencies, a limitation that the memory structures in long short-term memory (LSTM) networks were designed to address. Neural Turing machines (NTMs) are novel RNNs that implement the notion of a programmable computer, using a neural network controller that can learn simple algorithmic tasks. Matrix neural networks use a matrix representation, which inherently preserves the spatial structure of data, in contrast to canonical neural networks, which use a vector-based representation. One may therefore argue that neural networks with matrix representations have the potential to provide better memory capacity. In this paper, we define and study a probabilistic notion of memory capacity based on Fisher information for matrix-based RNNs. We find bounds on the memory capacity of such networks under various hypotheses and compare them with those of their vector counterparts. In particular, we show that the memory capacity of such networks is bounded by $N^2$ for an $N\times N$ state matrix, which generalizes the bound known for vector networks. We also demonstrate and analyze the increase in memory capacity that arises when such networks are equipped with an external state memory, as in NTMs. Consequently, we construct NTMs with RNN controllers and a matrix-based representation of external memory, which we call Matrix NTMs. We evaluate this class of memory networks on algorithmic learning tasks such as copying and recall and compare it with Matrix RNNs. We find that the addition of external memory improves the performance of Matrix NTMs relative to Matrix RNNs.
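The abstract does not state the update rule of a matrix-based RNN, so the following is only a minimal sketch under an assumed bilinear form, $H_t = \tanh(U H_{t-1} V^\top + W X_t Z^\top + B)$. The function name `matrix_rnn_step` and the weight names `U`, `V`, `W`, `Z`, `B` are hypothetical and are not taken from the paper. The sketch checks the standard identity $\mathrm{vec}(U H V^\top) = (V \otimes U)\,\mathrm{vec}(H)$, which shows that an $N\times N$ matrix state can be read as a structured vector RNN with an $N^2$-dimensional state. That reading is at least consistent with an $N^2$ bound generalizing the vector-network bound, although it is not the paper's argument.

```python
import numpy as np

def matrix_rnn_step(H, X, U, V, W, Z, B):
    """One step of an ASSUMED bilinear matrix RNN (hypothetical form):
    H' = tanh(U H V^T + W X Z^T + B)."""
    return np.tanh(U @ H @ V.T + W @ X @ Z.T + B)

def vec(M):
    """Column-major vectorization, so that vec(U H V^T) = (V kron U) vec(H)."""
    return M.reshape(-1, order="F")

rng = np.random.default_rng(0)
N = 4  # side length of the N x N state matrix (illustrative value)

# Hypothetical weights, scaled to keep pre-activations moderate.
U, V, W, Z = (rng.standard_normal((N, N)) / np.sqrt(N) for _ in range(4))
B = np.zeros((N, N))
H = rng.standard_normal((N, N))  # previous state
X = rng.standard_normal((N, N))  # current input

# Matrix-form update.
H_next = matrix_rnn_step(H, X, U, V, W, Z, B)

# Equivalent vector-form update on the N^2-dimensional state vec(H).
h_next = np.tanh(np.kron(V, U) @ vec(H) + np.kron(Z, W) @ vec(X) + vec(B))

state_dim = H.size                  # N^2 state entries
bilinear_params = U.size + V.size   # 2 N^2 recurrent weights in matrix form
dense_params = np.kron(V, U).size   # N^4 weights in an unstructured vector RNN

print(np.allclose(vec(H_next), h_next), state_dim, bilinear_params, dense_params)
```

Under this assumed form, the recurrent map uses $2N^2$ weights, whereas an unstructured vector RNN on the same $N^2$-dimensional state would use $N^4$.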