Compute-in-memory (CiM)-based binary neural network (CiM-BNN) accelerators marry the benefits of CiM and ultra-low-precision quantization, making them highly suitable for edge computing. However, CiM-enabled crossbar (Xbar) arrays are plagued by hardware non-idealities such as parasitic resistances and device non-linearities that impair inference accuracy, especially in scaled technologies. In this work, we first analyze the impact of Xbar non-idealities on the inference accuracy of various CiM-BNNs, establishing that the unique properties of CiM-BNNs make them more prone to hardware non-idealities than higher-precision deep neural networks (DNNs). To address this issue, we propose BinSparX, a training-free technique that mitigates non-idealities in CiM-BNNs. BinSparX exploits the distinct attributes of BNNs to reduce the average current generated during CiM operations in Xbar arrays. This is achieved by statically and dynamically sparsifying the BNN weights and activations, respectively (which, in the context of BNNs, is defined as reducing the number of +1 weights and activations). This minimizes the IR drops across the parasitic resistances, drastically mitigating their impact on inference accuracy. To evaluate our technique, we conduct experiments on ResNet-18 and VGG-small CiM-BNNs designed at the 7nm technology node using 8T-SRAM and 1T-1ReRAM. Our results show that BinSparX is highly effective in alleviating the impact of non-idealities, restoring inference accuracy to near-ideal (software) levels in some cases and providing an accuracy boost of up to 77.25%. These benefits are accompanied by energy reduction, albeit at the cost of a mild latency/area increase.
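To make the static weight-sparsification idea concrete, the following is a minimal illustrative sketch, not the authors' actual BinSparX implementation: for a {-1, +1} weight matrix, any column with a majority of +1 entries is negated (reducing the number of +1, i.e., conducting, cells on that bitline), and a per-column sign flag lets the digital periphery recover the original matrix-vector product. The function names and the column-flip scheme are assumptions introduced here for illustration only.

```python
import numpy as np

def sparsify_weights(W):
    """Statically reduce the number of +1 entries in a {-1, +1} weight
    matrix W (rows x cols, one column per Xbar output line) by negating
    every column that is majority +1. Illustrative sketch only; the
    actual BinSparX scheme may differ.
    Returns (W_sparse, flip), where flip[j] == -1 iff column j was negated."""
    flip = np.where(W.sum(axis=0) > 0, -1, 1)  # majority-+1 columns get -1
    return W * flip, flip

def mvm(x, W_sparse, flip):
    """Matrix-vector multiply on the sparsified array; the column sign
    flags restore the original result (flip * flip == 1 elementwise)."""
    return (x @ W_sparse) * flip

rng = np.random.default_rng(0)
W = rng.choice([-1, 1], size=(64, 16))   # binary weight tile
x = rng.choice([-1, 1], size=64)         # binary activation vector

W_sparse, flip = sparsify_weights(W)
assert np.array_equal(mvm(x, W_sparse, flip), x @ W)  # result unchanged
assert (W_sparse == 1).sum() <= (W == 1).sum()        # no more +1 cells
```

Because each flipped column ends up with at most half of its cells at +1, the average bitline current (proportional to the number of conducting cells under this simplified model) drops, which is what limits the IR drops across the parasitic resistances.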