ReLU and Addition-based Gated RNN

We replace the multiplication and sigmoid function of the conventional recurrent gate with addition and ReLU activation. The mechanism is designed to maintain long-term memory for sequence processing at a reduced computational cost, allowing more efficient execution or larger models on restricted hardware. Recurrent Neural Networks (RNNs) with gating mechanisms, such as LSTM and GRU, have been widely successful in learning from sequential data because they can capture long-term dependencies. Conventionally, the update based on current inputs and the previous state history are each multiplied by dynamic weights and then combined to compute the next state. However, multiplication can be computationally expensive, especially on certain hardware architectures or in alternative arithmetic systems such as homomorphic encryption. We demonstrate that the novel gating mechanism can capture long-term dependencies on a standard synthetic sequence learning task while significantly reducing computational cost: execution time is reduced by half on CPU and by one-third under encryption. Experimental results on handwritten text recognition tasks further show that the proposed architecture can be trained to achieve accuracy comparable to conventional GRU and LSTM baselines. The gating mechanism introduced in this paper may enable privacy-preserving AI applications operating under homomorphic encryption by avoiding the multiplication of encrypted variables. It can also support quantization in (unencrypted) plaintext applications, with the potential for substantial performance gains, since the addition-based formulation can avoid the expansion to double precision often required for multiplication.
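
To make the contrast between the two gating styles concrete, the sketch below compares a conventional sigmoid-and-multiply update with one possible gate built only from addition, subtraction, and ReLU. The abstract gives no equations, so `additive_relu_gate` is a hypothetical illustration and should not be read as the formulation used in the paper; the function names, the shared `gate_logit` signal, and the assumption that states and candidates are non-negative are all choices made for this example.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def relu(x):
    return np.maximum(x, 0.0)


def multiplicative_gate(h_prev, candidate, gate_logit):
    """Conventional GRU-style update: the gate multiplies state and candidate."""
    z = sigmoid(gate_logit)
    return z * h_prev + (1.0 - z) * candidate


def additive_relu_gate(h_prev, candidate, gate_logit):
    """Hypothetical gate using only addition, subtraction, and ReLU.

    Assumption (not taken from the paper): h_prev and candidate are
    non-negative, e.g. outputs of a ReLU. A strongly positive gate_logit
    keeps the old state; a strongly negative one writes the candidate.
    """
    # Subtracting a large offset and clipping at zero blocks the signal;
    # subtracting zero passes it through unchanged.
    keep = relu(h_prev - relu(-gate_logit))
    write = relu(candidate - relu(gate_logit))
    return keep + write


h_prev = np.array([0.5, 2.0])
candidate = np.array([1.0, 3.0])

kept = additive_relu_gate(h_prev, candidate, 10.0)      # gate open for the old state
written = additive_relu_gate(h_prev, candidate, -10.0)  # gate open for the candidate
```

In this sketch the gate never multiplies two data-dependent values, which is the property the abstract identifies as useful under homomorphic encryption and for quantization. For intermediate gate values the additive version clips rather than scales, so it is not numerically equivalent to the multiplicative gate.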
