Orthogonal recurrent neural networks (ORNNs) are an appealing option for learning tasks involving time series with long-term dependencies, thanks to their simplicity and computational stability. However, these networks often require a substantial number of parameters to perform well, which can be prohibitive in power-constrained environments such as compact devices. One approach to address this issue is neural network quantization. The construction of such quantized networks remains an open problem, known for its inherent instability. In this paper, we explore the quantization of the recurrent and input weight matrices in ORNNs, leading to Quantized approximately Orthogonal RNNs (QORNNs). We investigate one post-training quantization (PTQ) strategy and three quantization-aware training (QAT) algorithms that incorporate orthogonal constraints and quantized weights. Empirical results demonstrate the advantages of employing QAT over PTQ. The most efficient model achieves results similar to state-of-the-art full-precision ORNNs and LSTMs on a variety of standard benchmarks, even with 3-bit quantization.
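To make the phrase "approximately orthogonal" concrete, the following minimal sketch quantizes an exactly orthogonal matrix after the fact and measures how far the result drifts from orthogonality. It is an illustration only: the symmetric uniform quantizer, the Householder construction, and the helper names `householder`, `quantize`, and `orthogonality_error` are assumptions made for this example, not the scheme or code used in the paper.

```python
import math
import random


def householder(n, seed=0):
    """Return an n x n orthogonal matrix H = I - 2 v v^T / (v^T v)."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    vv = sum(x * x for x in v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / vv
             for j in range(n)] for i in range(n)]


def quantize(w, bits):
    """Symmetric uniform quantization (an assumed scheme, not the paper's)."""
    levels = 2 ** (bits - 1) - 1  # 3 bits -> integer codes in [-3, 3]
    scale = max(abs(x) for row in w for x in row) / levels
    return [[round(x / scale) * scale for x in row] for row in w]


def orthogonality_error(w):
    """Frobenius norm of W^T W - I; zero for an exactly orthogonal W."""
    n = len(w)
    err = 0.0
    for i in range(n):
        for j in range(n):
            gram_ij = sum(w[k][i] * w[k][j] for k in range(n))
            err += (gram_ij - (1.0 if i == j else 0.0)) ** 2
    return math.sqrt(err)


w = householder(8)
print("full precision:", orthogonality_error(w))
for bits in (8, 4, 3):
    print(bits, "bits:", orthogonality_error(quantize(w, bits)))
```

Rounding the weights after training, as in this sketch, corresponds loosely to a PTQ-style step: nothing in it tries to restore orthogonality once the entries have been snapped to the quantization grid. QAT methods instead account for the quantized weights during training.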