Multiplication-free operators, such as Shift and Add, have gained prominence for their hardware friendliness. However, neural networks (NNs) built on these operators typically achieve lower accuracy than conventional NNs with identical structures. ShiftAddAug uses costly multiplication to augment efficient but less powerful multiplication-free operators, improving accuracy without any inference overhead. It places a tiny ShiftAdd NN inside a large multiplicative model and trains it as a sub-model so that it receives additional supervision. To resolve the weight discrepancy between hybrid operators, a new weight-sharing method is proposed. In addition, a novel two-stage neural architecture search is used to obtain better augmentation effects for smaller but stronger multiplication-free tiny NNs. ShiftAddAug is validated through experiments on image classification and semantic segmentation, where it delivers consistent improvements. Notably, it achieves up to a 4.95% increase in accuracy on CIFAR100 compared to its directly trained counterparts, even surpassing the performance of multiplicative NNs.
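To make the idea of a multiplication-free sub-model inside a larger multiplicative model more concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes that a shift operator can be modeled by rounding weights to signed powers of two, and that the tiny network occupies the leading slice of a shared weight matrix; the function names `quantize_to_power_of_two` and `augmented_forward`, the exponent range, and the slicing scheme are all hypothetical choices made for illustration.

```python
import numpy as np


def quantize_to_power_of_two(w, min_exp=-8, max_exp=0):
    """Round each weight to sign(w) * 2**p so that multiplying by it
    can be realized as a bit shift (assumed model of a Shift operator)."""
    sign = np.sign(w)
    # Clamp the magnitude from below to avoid log2(0); zero weights have
    # sign 0, so they remain exactly zero after the final multiplication.
    magnitude = np.maximum(np.abs(w), 2.0 ** min_exp)
    exponent = np.clip(np.round(np.log2(magnitude)), min_exp, max_exp)
    return sign * np.exp2(exponent)


def augmented_forward(x, w_large, tiny_in, tiny_out):
    """One linear layer evaluated twice with shared weights.

    The leading (tiny_out x tiny_in) block of w_large is treated as the
    tiny multiplication-free network (hypothetical layout); the remaining
    entries stay ordinary multiplicative weights used only during training.
    Returns (tiny_output, large_output).
    """
    w_tiny = quantize_to_power_of_two(w_large[:tiny_out, :tiny_in])

    # The augmented model reuses the quantized block, so both outputs
    # depend on the same underlying parameters.
    w_aug = w_large.copy()
    w_aug[:tiny_out, :tiny_in] = w_tiny

    tiny_output = x[:, :tiny_in] @ w_tiny.T   # what would be deployed
    large_output = x @ w_aug.T                # extra supervision in training
    return tiny_output, large_output


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(2, 4))
    w_large = rng.normal(scale=0.5, size=(3, 4))
    tiny_out, large_out = augmented_forward(x, w_large, tiny_in=2, tiny_out=2)
    print(tiny_out.shape, large_out.shape)
```

In this sketch only the tiny slice would be kept for inference, which is one way to read the claim that augmentation adds no inference overhead; the weight-sharing method and the architecture search described in the abstract are not modeled here.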