Brain-inspired algorithms are attractive, emerging alternatives to classical deep learning methods for use in various machine learning applications. Brain-inspired systems can feature local learning rules, unsupervised/semi-supervised learning, and different types of plasticity (structural and synaptic), allowing them to potentially be faster and more energy-efficient than traditional machine learning alternatives. Among the more salient brain-inspired algorithms is the Bayesian Confidence Propagation Neural Network (BCPNN). BCPNN is an important tool for both machine learning and computational neuroscience research, and recent work shows that BCPNN can reach state-of-the-art performance in tasks such as learning and memory recall compared to other models. Unfortunately, BCPNN is primarily executed on slow general-purpose processors (CPUs) or power-hungry graphics processing units (GPUs), limiting the applicability of BCPNN in, among others, edge systems. In this work, we design a custom stream-based accelerator for BCPNN on Field-Programmable Gate Arrays (FPGAs) using the Xilinx Vitis High-Level Synthesis (HLS) flow. Furthermore, we model our accelerator's performance using first principles, and we empirically show that our proposed accelerator is between 1.3x and 5.3x faster than an Nvidia A100 GPU while consuming between 2.62x and 3.19x less power and between 5.8x and 16.5x less energy, without any degradation in the quality of the results.
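To make the abstract's mention of BCPNN concrete, the sketch below illustrates the classical batch form of the BCPNN weight rule, where weights are log-odds ratios of estimated activation probabilities, w_ij = log(p_ij / (p_i p_j)), and biases are b_j = log(p_j). This is a minimal illustrative sketch, not the accelerated streaming implementation described in the paper; the epsilon regularization is an assumption to avoid log(0), and exact formulations vary across BCPNN variants.

```python
import numpy as np

def bcpnn_weights(P):
    """Compute batch BCPNN log-odds weights from a binary pattern matrix.

    P: (n_patterns, n_units) binary array of stored activity patterns.
    Probabilities are estimated by counting over patterns; a small
    epsilon (an assumed regularizer) keeps the logarithms finite.
    Returns (W, b): weight matrix w_ij = log(p_ij / (p_i * p_j)) and
    bias vector b_j = log(p_j).
    """
    eps = 1e-6
    n = P.shape[0]
    p_i = P.mean(axis=0) + eps            # unit activation probabilities
    p_ij = (P.T @ P) / n + eps            # pairwise co-activation probabilities
    W = np.log(p_ij / np.outer(p_i, p_i)) # positive weight => units co-occur
    b = np.log(p_i)                       # prior log-probability of each unit
    return W, b
```

For two statistically independent units, p_ij ≈ p_i p_j, so the cross weight is near zero; units that always fire together get a positive weight.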