Spiking neural networks (SNNs) have tremendous potential for energy-efficient neuromorphic chips because of their binary, event-driven architecture. SNNs have been used primarily for classification tasks, while their application to image generation has seen limited exploration. To fill this gap, we propose Spiking-Diffusion, a model based on the vector quantized discrete diffusion model. First, we develop a vector quantized variational autoencoder with SNNs (VQ-SVAE) to learn a discrete latent space for images. In VQ-SVAE, image features are encoded using both the spike firing rate and the postsynaptic potential, and an adaptive spike generator is designed to restore embedding features in the form of spike trains. Next, we perform absorbing state diffusion in the discrete latent space and construct a spiking diffusion image decoder (SDID) with SNNs to denoise the image. Our work is the first to build a diffusion model entirely from SNN layers. Experimental results on MNIST, FMNIST, KMNIST, Letters, and CIFAR-10 demonstrate that Spiking-Diffusion outperforms the existing SNN-based generation model. We achieve FIDs of 37.50, 91.98, 59.23, 67.41, and 120.5 on these datasets, respectively, which are reductions of 58.60\%, 18.75\%, 64.51\%, 29.75\%, and 44.88\% in FID compared with the state-of-the-art work. Our code will be available at \url{https://github.com/Arktis2022/Spiking-Diffusion}.
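To make the absorbing state diffusion step concrete, the following is a minimal sketch of the forward (corruption) process on a sequence of discrete latent indices. It is an illustration under stated assumptions, not the authors' implementation: the `MASK` id, the `absorb` helper, the linear masking schedule `t / T`, the codebook size of 128, and the sequence length of 16 are all hypothetical choices made for this example.

```python
import random

MASK = -1  # hypothetical id for the absorbing [MASK] state (not from the paper)


def absorb(tokens, t, T, rng):
    """Corrupt clean latent indices x_0 into x_t.

    Each token is independently replaced by MASK with probability t / T
    (an assumed linear schedule); otherwise it is kept unchanged.
    """
    p = t / T
    return [MASK if rng.random() < p else tok for tok in tokens]


rng = random.Random(0)

# Assumed toy setup: 16 latent positions drawn from a codebook of 128 entries.
x0 = [rng.randrange(128) for _ in range(16)]

x_start = absorb(x0, t=0, T=100, rng=rng)    # no corruption at t = 0
x_half = absorb(x0, t=50, T=100, rng=rng)    # roughly half the tokens masked
x_end = absorb(x0, t=100, T=100, rng=rng)    # every token absorbed at t = T

print(x_half)
```

In this sketch a token either keeps its original codebook index or moves to the mask state, so a denoiser such as the SDID only has to predict the indices at masked positions.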