The Transformer is a state-of-the-art model in natural language processing (NLP). Current NLP models primarily improve performance by increasing the number of Transformers. However, this approach requires substantial training resources, such as computing capacity. This paper proposes a novel Transformer structure featuring full layer normalization, weighted residual connections, positional encoding that exploits reinforcement learning, and zero-masked self-attention. The proposed model, called the Enhanced Transformer, is validated using the bilingual evaluation understudy (BLEU) score on the Multi30k translation dataset. The Enhanced Transformer achieves a BLEU score 202.96% higher than that of the original Transformer on this dataset.
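To make the size of the reported gain concrete, the short sketch below converts a "percent higher" figure into a multiplicative factor and back. The function names and the baseline score of 10.0 are hypothetical and chosen only for illustration; the abstract does not state the absolute BLEU scores.

```python
def relative_improvement_percent(baseline: float, improved: float) -> float:
    """Return the percentage by which `improved` exceeds `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline * 100.0


def score_ratio(percent_higher: float) -> float:
    """Return the multiplicative factor implied by a 'percent higher' claim."""
    return 1.0 + percent_higher / 100.0


# Factor implied by the reported 202.96% improvement (about 3.03x).
ratio = score_ratio(202.96)

# Hypothetical baseline BLEU, used only to illustrate the arithmetic.
hypothetical_baseline = 10.0
hypothetical_improved = hypothetical_baseline * ratio

# Recover the percentage from the two (hypothetical) scores.
recovered = relative_improvement_percent(hypothetical_baseline, hypothetical_improved)
```

Under this reading, a 202.96% higher score means the Enhanced Transformer's BLEU is roughly three times the baseline's, not roughly twice.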