We propose a novel tensor network language model based on the simplest tensor network (i.e., tensor trains), called the "Tensor Train Language Model" (TTLM). TTLM represents sentences in an exponentially large space constructed by the tensor product of words, while computing the probabilities of sentences in a low-dimensional fashion. We demonstrate that the architectures of Second-order RNNs, Recurrent Arithmetic Circuits (RACs), and Multiplicative Integration RNNs are, essentially, special cases of TTLM. Experimental evaluations on real language modeling tasks show that the proposed variants of TTLM (i.e., TTLM-Large and TTLM-Tiny) outperform vanilla Recurrent Neural Networks (RNNs) when the number of hidden units is small. (The code is available at https://github.com/shuishen112/tensortrainlm.)
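
The claim that a sentence lives in an exponentially large tensor-product space yet can be scored in a low-dimensional fashion can be illustrated with a small NumPy sketch. This is not the authors' implementation: the core shape `(r, d, r)`, the boundary vectors `first` and `last`, the function names, and the random inputs are all assumptions made for illustration, and the result is an unnormalized score rather than a probability. The sketch checks that contracting a tensor train step by step gives the same number as building the full weight tensor with `d**n` entries and taking its inner product with the tensor product of the word vectors.

```python
import numpy as np


def tt_score_sequential(cores, first, last, xs):
    """Contract the train left to right; only a length-r vector is kept."""
    h = first  # shape (r,)
    for core, x in zip(cores, xs):
        # core has shape (r, d, r): contract the left bond with h
        # and the input index with the word vector x.
        h = np.einsum("i,idj,d->j", h, core, x)
    return float(h @ last)


def tt_score_full(cores, first, last, xs):
    """Same score, computed naively in the full d**n-dimensional space."""
    # Weight tensor: contract bond indices only, keep every input index.
    w = np.einsum("i,idj->dj", first, cores[0])  # shape (d, r)
    for core in cores[1:]:
        w = np.tensordot(w, core, axes=1)  # shape (d, ..., d, r)
    w = np.tensordot(w, last, axes=1)  # shape (d,) * n
    # Sentence representation: tensor (outer) product of the word vectors.
    phi = xs[0]
    for x in xs[1:]:
        phi = np.multiply.outer(phi, x)
    return float(np.sum(w * phi)), w.size


# Hypothetical sizes: n words, d-dimensional word vectors, bond rank r.
n, d, r = 4, 3, 2
rng = np.random.default_rng(0)
cores = [rng.standard_normal((r, d, r)) for _ in range(n)]
first = rng.standard_normal(r)
last = rng.standard_normal(r)
xs = [rng.standard_normal(d) for _ in range(n)]

seq_score = tt_score_sequential(cores, first, last, xs)
full_score, full_size = tt_score_full(cores, first, last, xs)
print(np.isclose(seq_score, full_score), full_size)
```

The sequential version stores one vector of length `r` at each step, whereas the naive version materializes `d**n` entries. In this sketch each position has its own core; reusing a single core at every position would give a recurrent update, which is one way to read the connection to RNN-style models stated above.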