We propose a novel attention-based framework for identifying the sentiment of a movie review document. Previous work on deep neural networks with attention mechanisms has focused on encoders and decoders with a fixed number of attention heads. A mechanism is therefore needed to stop the attention process automatically once no more useful information can be read from the memory. In this paper, we propose an adaptive multi-head attention architecture (AdaptAttn) that varies the number of attention heads based on sentence length. AdaptAttn includes a data preprocessing step in which each document is assigned to one of three bins (small, medium, or large) according to sentence length. Documents in the small bin pass through two heads in each layer, those in the medium bin through four heads, and those in the large bin through eight heads. We evaluate our model on the Stanford Large Movie Review Dataset. The experimental results show that the F1 score of our model is on par with that of the baseline model.
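The binning step described above can be illustrated with a minimal sketch. The head counts per bin (two, four, eight) come from the abstract. The token-count thresholds, the whitespace tokenization, and the names `length_bin` and `num_heads_for` are assumptions made for illustration only, since the abstract does not state how the bin boundaries are chosen.

```python
from typing import Dict, List

# Hypothetical thresholds (in tokens); the abstract does not give the bin boundaries.
SMALL_MAX_TOKENS = 100
MEDIUM_MAX_TOKENS = 300

# Attention heads per layer for each bin, as stated in the abstract.
HEADS_PER_BIN: Dict[str, int] = {"small": 2, "medium": 4, "large": 8}


def length_bin(tokens: List[str]) -> str:
    """Assign a tokenized document to the small, medium, or large bin by length."""
    n = len(tokens)
    if n <= SMALL_MAX_TOKENS:
        return "small"
    if n <= MEDIUM_MAX_TOKENS:
        return "medium"
    return "large"


def num_heads_for(tokens: List[str]) -> int:
    """Return the number of attention heads used per layer for this document."""
    return HEADS_PER_BIN[length_bin(tokens)]


if __name__ == "__main__":
    # Whitespace splitting stands in for whatever tokenizer the model actually uses.
    review = "A short but surprisingly moving film .".split()
    print(length_bin(review), num_heads_for(review))
```

In a standard multi-head attention layer the model dimension is split evenly across heads, so a model dimension divisible by eight would accommodate all three head counts.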