Artificial neural networks are currently trained with parameters encoded as floating-point numbers, which occupy a large amount of memory at inference time. As deep learning models grow in size, training and running them on edge devices is becoming increasingly difficult. Binary neural networks promise to reduce the size of deep neural network models, increase inference speed, and decrease energy consumption, and may thus allow more powerful models to be deployed on edge devices. However, binary neural networks have proven difficult to train with backpropagation-based gradient descent. This paper extends the work of \cite{crulis2023alternatives}, which adapted to binary neural networks two promising alternatives to backpropagation originally designed for continuous neural networks and evaluated them on simple image classification datasets. It presents new experiments on the ImageNette dataset, compares three model architectures for image classification, and adds two further alternatives to backpropagation.