Dense prediction models based on traditional Artificial Neural Networks (ANNs) consume a great deal of energy, especially on image restoration tasks. Networks built on the Spiking Neural Network (SNN) framework have begun to make their mark in image restoration, largely because they typically consume less than 10\% of the energy of ANNs with the same architecture. However, training an SNN is far more expensive than training an ANN, owing to its reliance on a heuristic gradient descent strategy. In other words, the SNN's membrane potential signal evolves from sparse to dense very slowly, which hinders the convergence of the whole model. To tackle this problem, we propose a novel distillation technique, asymmetric framework (ANN-SNN) distillation, in which the teacher is an ANN and the student is an SNN. Specifically, we leverage the intermediate features (feature maps) learned by the ANN as hints to guide the training of the SNN. This approach not only accelerates the convergence of the SNN but also improves its final performance, effectively bridging the gap between the efficiency of SNNs and the superior learning capability of ANNs. Extensive experiments show that our SNN-based image restoration model, with only 1/300 the parameters and 1/50 the energy consumption of the teacher network, matches the teacher on some denoising tasks.
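To make the hint mechanism concrete, the sketch below shows one plausible form of the feature-level distillation objective in PyTorch. The \texttt{HintDistiller} name, the assumption that both teacher and student return an intermediate feature map alongside their output, the 1x1 channel projection, the L1/MSE loss choices, and the 0.1 hint weight are all illustrative assumptions, not the paper's exact formulation.

\begin{verbatim}
import torch
import torch.nn as nn
import torch.nn.functional as F

class HintDistiller(nn.Module):
    """Feature-hint distillation from a frozen ANN teacher to an SNN student.

    A minimal sketch: both networks are assumed to return a tuple
    (intermediate feature map, restored image).
    """

    def __init__(self, teacher, student, s_channels, t_channels, hint_weight=0.1):
        super().__init__()
        self.teacher = teacher.eval()   # frozen ANN teacher
        self.student = student          # trainable SNN student
        # 1x1 conv projects student features to the teacher's channel width
        self.proj = nn.Conv2d(s_channels, t_channels, kernel_size=1)
        self.hint_weight = hint_weight  # assumed weighting, not from the paper
        for p in self.teacher.parameters():
            p.requires_grad_(False)

    def forward(self, noisy, clean):
        with torch.no_grad():
            t_feat, _ = self.teacher(noisy)          # teacher hint features
        s_feat, s_out = self.student(noisy)
        task_loss = F.l1_loss(s_out, clean)          # restoration objective
        hint_loss = F.mse_loss(self.proj(s_feat), t_feat)  # feature-hint term
        return task_loss + self.hint_weight * hint_loss
\end{verbatim}

During training, only the student and projection parameters receive gradients; the combined loss is minimized over noisy/clean pairs, so the teacher's feature maps act purely as fixed hints.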