Binary neural networks (BNNs) use 1-bit quantized weights and activations to reduce both the storage demands and the computational burden of a model. However, advanced binary architectures still contain millions of inefficient, hardware-unfriendly full-precision multiplication operations. We propose A&B BNN, which directly removes part of the multiplication operations in a traditional BNN and replaces the rest with an equal number of bit operations. To this end, it introduces a mask layer and a quantized RPReLU structure, both built on a normalizer-free network architecture. The mask layer can be removed during inference by exploiting intrinsic characteristics of BNNs together with straightforward mathematical transformations, which avoids the associated multiplication operations. The quantized RPReLU structure enables more efficient bit operations by constraining its slope to integer powers of 2. In experiments, A&B BNN achieves accuracies of 92.30%, 69.35%, and 66.89% on the CIFAR-10, CIFAR-100, and ImageNet datasets, respectively, which is competitive with the state of the art. Ablation studies verify the efficacy of the quantized RPReLU structure, which yields a 1.14% improvement on ImageNet over a fixed-slope RLeakyReLU. The proposed add&bit-operation-only BNN offers an innovative approach to hardware-friendly network architecture.
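To illustrate why constraining the RPReLU slope to an integer power of 2 permits bit operations, the following is a minimal sketch and not the authors' implementation. It assumes integer (fixed-point) activations, a positive slope, and the commonly used RPReLU form with learnable shifts; the names `quantize_slope_to_pow2`, `shift_scale`, `rprelu_pow2`, `gamma`, and `zeta` are hypothetical and chosen only for this example.

```python
import math


def quantize_slope_to_pow2(beta: float) -> int:
    """Return the integer exponent k such that 2**k is the power of two
    nearest to beta in the log domain. Assumes beta > 0 (an assumption of
    this sketch, not a statement about the paper's training procedure)."""
    if beta <= 0:
        raise ValueError("this sketch only handles positive slopes")
    return round(math.log2(beta))


def shift_scale(x: int, k: int) -> int:
    """Multiply the integer x by 2**k using only a bit shift.
    A negative k becomes an arithmetic right shift, which rounds toward
    negative infinity for Python integers."""
    return x << k if k >= 0 else x >> -k


def rprelu_pow2(x: int, k: int, gamma: int = 0, zeta: int = 0) -> int:
    """RPReLU-style activation with slope 2**k on the negative branch.
    Only subtraction, addition, comparison, and a shift are used."""
    shifted = x - gamma
    if shifted > 0:
        return shifted + zeta
    return shift_scale(shifted, k) + zeta


if __name__ == "__main__":
    k = quantize_slope_to_pow2(0.3)  # nearest power of two to 0.3 is 2**-2
    print(k, rprelu_pow2(-8, k), rprelu_pow2(5, k))
```

With a slope of 2**-2, the negative branch reduces to a right shift by two bits, so no multiplier is needed for the activation under these assumptions.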