Recurrent neural networks (RNNs) capable of modeling long-distance dependencies are widely used in various speech tasks, e.g., keyword spotting (KWS) and speech enhancement (SE). Due to the power and memory limitations of low-resource devices, efficient RNN models are urgently required for real-world applications. In this paper, we propose an efficient RNN architecture, GhostRNN, which reduces hidden state redundancy with cheap operations. In particular, we observe that some dimensions of the hidden states are similar to others in trained RNN models, suggesting that redundancy exists in such RNNs. To reduce the redundancy and hence the computational cost, we propose to first generate a few intrinsic states, and then apply cheap operations to produce ghost states based on the intrinsic states. Experiments on KWS and SE tasks demonstrate that the proposed GhostRNN significantly reduces memory usage (~40%) and computational cost while maintaining comparable performance.
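The core idea can be illustrated with a minimal sketch: compute a small number of intrinsic state dimensions with the usual (expensive) recurrent update, then expand them to the full hidden size with a cheap linear map. All function and parameter names below (`ghost_rnn_step`, `cheap_W`, etc.) are hypothetical; this is an assumption-laden illustration of the general idea, not the paper's exact formulation.

```python
import numpy as np

def ghost_rnn_step(x, h_prev, W_in, W_rec, b, cheap_W, cheap_b):
    """One recurrent step in the GhostRNN style (illustrative sketch only).

    A reduced-dimension tanh RNN cell produces the intrinsic states;
    a cheap linear transformation expands them into ghost states, so the
    expensive recurrent matrix multiply only covers the intrinsic part.
    """
    # Intrinsic states: standard tanh RNN update on a reduced dimension d_int.
    h_int = np.tanh(x @ W_in + h_prev @ W_rec + b)
    # Ghost states: cheap linear map of the intrinsic states (d_int -> d_full - d_int).
    h_ghost = h_int @ cheap_W + cheap_b
    # Full hidden state = intrinsic states concatenated with ghost states.
    return np.concatenate([h_int, h_ghost])

# Toy dimensions: 8-dim input, 4 intrinsic dims, 12-dim full hidden state.
rng = np.random.default_rng(0)
d_in, d_int, d_full = 8, 4, 12
x = rng.standard_normal(d_in)
h_prev = np.zeros(d_full)
W_in = rng.standard_normal((d_in, d_int)) * 0.1
W_rec = rng.standard_normal((d_full, d_int)) * 0.1
b = np.zeros(d_int)
cheap_W = rng.standard_normal((d_int, d_full - d_int)) * 0.1
cheap_b = np.zeros(d_full - d_int)

h_new = ghost_rnn_step(x, h_prev, W_in, W_rec, b, cheap_W, cheap_b)
print(h_new.shape)  # (12,)
```

Because the recurrent weight matrix here is `d_full × d_int` rather than `d_full × d_full`, both the parameter count and the per-step multiply-accumulate cost shrink roughly in proportion to the intrinsic-to-full ratio, which is consistent with the memory savings reported in the abstract.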