Spiking neural networks (SNNs) built from leaky integrate-and-fire (LIF) neurons have been commonly used in automatic speech recognition (ASR) tasks. However, the LIF neuron is still relatively simple compared with neurons in the biological brain, so further research on more types of neurons with different scales of neuronal dynamics is necessary. Here we introduce four types of neuronal dynamics to post-process the sequential patterns generated by the spiking transformer, yielding the complex dynamic neuron improved spiking transformer neural network (DyTr-SNN). We found that the DyTr-SNN handles a non-toy ASR task well, achieving a lower phoneme error rate, lower computational cost, and higher robustness. These results indicate that further cooperation between SNNs and neural dynamics at the neuron and network scales might have much in store for the future, especially for ASR tasks.
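
For readers unfamiliar with the baseline neuron the abstract refers to, a discrete-time LIF update can be sketched as follows. This is a minimal illustration under assumed parameters, not the DyTr-SNN implementation: the function name `simulate_lif`, the time constant `tau`, the threshold, the reset value, and the Euler step `dt` are all hypothetical choices made for the example, and the four neuronal dynamics introduced in this work are not reproduced here.

```python
def simulate_lif(inputs, tau=10.0, v_threshold=1.0, v_reset=0.0, dt=1.0):
    """Simulate one leaky integrate-and-fire neuron (illustrative sketch).

    inputs: sequence of input currents, one per time step.
    All parameter values are assumptions chosen for the example.
    Returns (spikes, potentials), two lists with one entry per time step.
    """
    v = v_reset
    spikes = []
    potentials = []
    for current in inputs:
        # Euler step: leak toward the reset potential and integrate the input.
        v = v + (dt / tau) * (-(v - v_reset) + current)
        if v >= v_threshold:
            # Threshold crossed: emit a spike and hard-reset the membrane.
            spikes.append(1)
            v = v_reset
        else:
            spikes.append(0)
        potentials.append(v)
    return spikes, potentials


if __name__ == "__main__":
    # A constant supra-threshold input produces periodic spiking.
    spikes, _ = simulate_lif([2.0] * 14)
    print(spikes)
```

With a constant input the membrane potential relaxes toward the input value, so the neuron fires periodically only when that value exceeds the threshold; richer neuron models add further state variables on top of this single leaky integrator.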