Gates Are Not What You Need in RNNs

Source: arXiv. Published in Artificial Intelligence and Soft Computing, ICAISC 2023, Lecture Notes in Computer Science, vol 14125, Springer, Cham. Available online at https://doi.org/10.1007/978-3-031-42505-9_27

Recurrent neural networks have flourished in many areas. Consequently, new RNN cells are developed continuously, usually by creating or using gates in a new, original way. But what if we told you that gates in RNNs are redundant? In this paper, we propose a new recurrent cell called the Residual Recurrent Unit (RRU), which beats traditional cells and does not employ a single gate. It is based on the residual shortcut connection, linear transformations, ReLU, and normalization. To evaluate our cell's effectiveness, we compare its performance against the widely used GRU and LSTM cells and the recently proposed Mogrifier LSTM on several tasks, including polyphonic music modeling, language modeling, and sentiment analysis. Our experiments show that RRU outperforms the traditional gated units on most of these tasks. It is also more robust to parameter selection, allowing immediate application to new tasks without much tuning. We have implemented the RRU in TensorFlow, and the code is available at https://github.com/LUMII-Syslab/RRU.
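To make the ingredients named in the abstract concrete, here is a minimal NumPy sketch of a gate-free recurrent step built from a linear transformation, normalization, ReLU, and a residual shortcut. It is an illustration only and not the RRU as defined in the paper or in the linked repository: the class name GateFreeResidualCell, the layer sizes, the use of layer normalization, and the exact order of operations are all assumptions made for this example.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each row to zero mean and unit variance (no learned scale or shift).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

class GateFreeResidualCell:
    """Hypothetical gate-free recurrent cell; NOT the paper's exact RRU."""

    def __init__(self, input_size, state_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        fan_in = input_size + state_size
        # First linear transformation: [input, state] -> hidden.
        self.w_in = rng.normal(0.0, 1.0 / np.sqrt(fan_in), (fan_in, hidden_size))
        self.b_in = np.zeros(hidden_size)
        # Second linear transformation: hidden -> state update.
        self.w_out = rng.normal(0.0, 1.0 / np.sqrt(hidden_size), (hidden_size, state_size))
        self.b_out = np.zeros(state_size)

    def step(self, x, h):
        # Linear transformation of the concatenated input and previous state.
        z = np.concatenate([x, h], axis=-1) @ self.w_in + self.b_in
        # Normalization followed by ReLU; no sigmoid or tanh gates anywhere.
        z = np.maximum(layer_norm(z), 0.0)
        update = z @ self.w_out + self.b_out
        # Residual shortcut: the previous state is carried forward unchanged
        # and the computed update is added to it.
        return h + update

def run_sequence(cell, xs, h0):
    # xs has shape (time, batch, input_size); returns the final state.
    h = h0
    for x in xs:
        h = cell.step(x, h)
    return h
```

In this sketch the shortcut h + update plays the role that gates play in GRU and LSTM cells, giving the state a direct path through time. For the actual cell definition and training setup, refer to the paper and the authors' TensorFlow code.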


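Related topic: Long short-term memory (LSTM)

Long short-term memory (LSTM) is an artificial recurrent neural network (RNN) architecture used in deep learning. Unlike standard feedforward neural networks, LSTM has feedback connections. It can process not only single data points (such as images) but also entire sequences of data (such as speech or video). For example, LSTM is applicable to tasks such as unsegmented, connected handwriting recognition, speech recognition, and anomaly detection in network traffic or intrusion detection systems (IDSs).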
