Bayesian Neural Networks (BNNs) provide superior uncertainty estimates by generating an ensemble of predictive distributions. However, inference via ensembling is resource-intensive, requiring additional entropy sources to generate stochasticity, which further increases resource consumption. We introduce Bayes2IMC, an in-memory computing (IMC) architecture for binary Bayesian neural networks that leverages nanoscale device stochasticity to generate the desired distributions. Our approach harnesses the inherent noise characteristics of Phase-Change Memory (PCM) to realize a binary neural network. This design eliminates the need for a pre-neuron analog-to-digital converter (ADC), significantly improving power and area efficiency. We also develop a hardware-software co-optimized correction method, applied solely to the logits of the final layer, to reduce device-induced accuracy variations across hardware deployments. Additionally, we devise a simple compensation technique that ensures no drop in classification accuracy despite the conductance drift of PCM. We validate the effectiveness of our approach on the CIFAR-10 dataset with a VGGBinaryConnect model, achieving accuracy comparable to ideal software implementations as well as to results reported in the literature using other technologies. Finally, we present a complete core architecture and compare its projected power, performance, and area efficiency against an equivalent SRAM baseline, showing a $3.8\times$ to $9.6\times$ improvement in total efficiency (in GOPS/W/mm$^2$) and a $2.2\times$ to $5.6\times$ improvement in power efficiency (in GOPS/W). In addition, the projected hardware performance of Bayes2IMC surpasses that of most BNN architectures based on memristive devices reported in the literature, achieving up to $20\%$ higher power efficiency than the state of the art.