ShiftAddAug: Augment Multiplication-Free Tiny Neural Network with Hybrid Computation

Operators devoid of multiplication, such as Shift and Add, have gained prominence for their compatibility with hardware. However, neural networks (NNs) employing these operators typically exhibit lower accuracy compared to conventional NNs with identical structures. ShiftAddAug uses costly multiplication to augment efficient but less powerful multiplication-free operators, improving performance without any inference overhead. It puts a ShiftAdd tiny NN into a large multiplicative model and encourages it to be trained as a sub-model to obtain additional supervision. In order to solve the weight discrepancy problem between hybrid operators, a new weight sharing method is proposed. Additionally, a novel two stage neural architecture search is used to obtain better augmentation effects for smaller but stronger multiplication-free tiny neural networks. The superiority of ShiftAddAug is validated through experiments in image classification and semantic segmentation, consistently delivering noteworthy enhancements. Remarkably, it secures up to a 4.95% increase in accuracy on the CIFAR100 compared to its directly trained counterparts, even surpassing the performance of multiplicative NNs.

翻译：无乘法运算符（如移位和加法）因其硬件友好性而备受关注。然而，采用这些运算符的神经网络在结构相同的情况下，通常比传统神经网络精度更低。ShiftAddAug 通过引入高成本乘法运算来增强高效但性能较弱的无乘法运算符，从而在不增加推理开销的前提下提升性能。该方法将 ShiftAdd 微型神经网络嵌入大型乘法模型中，并鼓励其作为子模型进行训练以获得额外监督。为解决混合运算符间的权重失配问题，本文提出了一种新的权重共享方法。此外，通过新颖的两阶段神经架构搜索，可为更小更强的无乘法微型神经网络获得更优的增强效果。ShiftAddAug 的优越性在图像分类和语义分割实验中得到了验证，均取得了显著提升。值得注意的是，在 CIFAR100 数据集上，其精度相比直接训练的对照模型最高提升 4.95%，甚至超越了采用乘法运算的神经网络性能。

相关内容

Neural Networks

关注 1654

神经网络（Neural Networks）是世界上三个最古老的神经建模学会的档案期刊:国际神经网络学会(INNS)、欧洲神经网络学会(ENNS)和日本神经网络学会(JNNS)。神经网络提供了一个论坛，以发展和培育一个国际社会的学者和实践者感兴趣的所有方面的神经网络和相关方法的计算智能。神经网络欢迎高质量论文的提交，有助于全面的神经网络研究，从行为和大脑建模，学习算法，通过数学和计算分析，系统的工程和技术应用，大量使用神经网络的概念和技术。这一独特而广泛的范围促进了生物和技术研究之间的思想交流，并有助于促进对生物启发的计算智能感兴趣的跨学科社区的发展。因此，神经网络编委会代表的专家领域包括心理学，神经生物学，计算机科学，工程，数学，物理。该杂志发表文章、信件和评论以及给编辑的信件、社论、时事、软件调查和专利信息。文章发表在五个部分之一:认知科学，神经科学，学习系统，数学和计算分析、工程和应用。官网地址：http://dblp.uni-trier.de/db/journals/nn/

《生成式模型: 变分自编码器与扩散模型》，75页ppt，Google DeepMind科学家Ruiqi Gao

专知会员服务

66+阅读 · 2023年6月10日

【NeurIPS2021】用于文本图表示学习的 GNN 嵌套 Transformer 模型：GraphFormers

专知会员服务

46+阅读 · 2021年11月24日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

35+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日