Deep neural networks (DNNs) are widely used, but their matrix multiplications incur significant computational costs, especially from moving data between memory and processing units. Processing-in-Memory (PIM) is therefore a promising approach, as it greatly reduces this data-movement overhead. However, most PIM solutions rely either on novel memory technologies that have yet to mature or on bit-serial computation, which carries significant performance overhead and scalability issues. We propose an in-SRAM digital multiplier that uses conventional memory to perform bit-parallel computations by leveraging multiple-wordline activation. We then introduce DAISM, an architecture built on this multiplier, which achieves up to two orders of magnitude higher area efficiency than state-of-the-art counterparts, with competitive energy efficiency.