Breakthroughs in deep learning and memory networks have driven major advances in natural language understanding. Language is sequential, and the information carried through a sequence can be captured by memory networks, so learning the sequence is a key part of learning the language. However, memory networks cannot hold infinitely long sequences in memory and are limited by constraints such as the vanishing or exploding gradient problem. As a result, natural language understanding models are affected when presented with long sequential text. We introduce the Long Term Memory network (LTM) to learn from infinitely long sequences. LTM gives priority to the current input so that it has a high impact. Language modeling is an important part of natural language understanding and requires long-term memory, so we test LTM on this task using the Penn Treebank, Google Billion Word, and WikiText-2 datasets. We compare LTM with other language models that require long-term memory.
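To make the vanishing or exploding gradient constraint mentioned above concrete, the following is a minimal sketch, not part of LTM itself. It assumes a toy linear scalar recurrence h_t = w * h_{t-1}, for which the gradient of h_T with respect to h_0 is w^T. The function name `gradient_scale` and the chosen weights are hypothetical, picked only for illustration.

```python
def gradient_scale(recurrent_weight: float, steps: int) -> float:
    """Magnitude of d h_T / d h_0 for the toy recurrence h_t = w * h_{t-1}.

    Assumption: a linear, scalar recurrence with no gating or nonlinearity.
    Real memory networks are more complex, but the same repeated
    multiplication drives the effect.
    """
    return abs(recurrent_weight) ** steps


if __name__ == "__main__":
    # A weight below 1 shrinks the gradient with sequence length (vanishing),
    # a weight above 1 grows it (exploding), and exactly 1 preserves it.
    for w in (0.9, 1.0, 1.1):
        scales = [gradient_scale(w, t) for t in (10, 100)]
        print(f"w={w}: gradient scale after 10 and 100 steps = {scales}")
```

In this toy setting, only a weight of exactly 1 keeps the gradient magnitude stable over long sequences, which is one way to see why long sequential text is hard for recurrent models.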