This paper presents an architecture that uses a 10T SRAM cell for XNOR-based in-memory computing, aimed at mitigating the extensive routing that conventional in-memory computing systems typically require. By integrating a full adder between in-memory multiplication cells, the proposed design achieves a 50% reduction in routing complexity. The architecture performs multiply-accumulate (MAC) operations using XNOR computation, which is well suited to binary neural networks (BNNs). Additionally, a 14T full adder is used to construct the N-bit ripple-carry adders in the adder tree, significantly reducing area compared to conventional 28T CMOS full adders. The 10T SRAM XNOR computation further reduces the latency of MAC operations. Together, these choices reduce latency and area overhead, improving the overall hardware area efficiency by 2.67x compared to the state of the art.
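To make the arithmetic behind the XNOR-based MAC concrete, the following is a minimal software sketch, not the proposed circuit. It assumes the usual BNN encoding in which bit 1 stands for +1 and bit 0 for -1, so that a signed dot product equals `2 * popcount(XNOR) - N`. All function names (`xnor`, `full_adder`, `ripple_carry_add`, `xnor_mac`) are hypothetical, and the accumulation here is sequential for brevity, whereas the hardware described above uses an adder tree.

```python
def xnor(a: int, b: int) -> int:
    """1-bit XNOR: returns 1 when the two bits match, else 0."""
    return 1 - (a ^ b)


def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Logical behavior of a 1-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(x_bits: list[int], y_bits: list[int]) -> list[int]:
    """Add two equal-length little-endian bit lists by chaining full adders.

    The result has one extra bit to hold the final carry.
    """
    carry = 0
    out = []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    out.append(carry)
    return out


def to_bits(value: int, width: int) -> list[int]:
    """Little-endian bit list of a non-negative integer."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits: list[int]) -> int:
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def xnor_mac(weights: list[int], activations: list[int]) -> int:
    """Binary MAC over bit-encoded +1/-1 values (assumed encoding: 1 -> +1, 0 -> -1)."""
    n = len(weights)
    matches = [xnor(w, a) for w, a in zip(weights, activations)]
    # Count matching positions by accumulating through the ripple-carry adder.
    # n.bit_length() bits are enough to hold any count from 0 to n.
    width = n.bit_length()
    acc = to_bits(0, width)
    for m in matches:
        acc = ripple_carry_add(acc, to_bits(m, width))[:width]
    popcount = from_bits(acc)
    return 2 * popcount - n
```

For example, `xnor_mac([1, 0, 1, 1], [1, 1, 0, 1])` corresponds to the signed dot product of (+1, -1, +1, +1) and (+1, +1, -1, +1), which is 0.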