SRAM-based Analog Compute-in-Memory (ACiM) demonstrates promising energy efficiency for deep neural network (DNN) processing. Although recent aggressive design strategies have led to successive improvements in efficiency, the accompanying inference accuracy challenges have received limited discussion. Given the growing difficulty of validating ACiM circuits with full-scale DNNs, a standardized modeling methodology and an open-source inference simulator are urgently needed. This paper presents ASiM, a simulation framework specifically designed to assess inference quality, enabling comparisons across ACiM prototype chips and guiding design decisions. ASiM works as a plug-and-play tool that integrates seamlessly with the PyTorch ecosystem, offering both speed and ease of use. Using ASiM, we conducted a comprehensive analysis of how various design factors impact DNN inference. We observed that activation encoding can tolerate certain levels of quantization noise, indicating substantial potential for bit-parallel schemes to enhance energy efficiency. However, inference accuracy remains highly susceptible to noise: because ACiM circuits typically operate with a limited ADC dynamic range, even errors as small as 1 LSB can significantly degrade accuracy. This underscores the need for high design standards, especially for complex DNN models and challenging tasks. In response to these findings, we propose two solutions: a Hybrid Compute-in-Memory architecture and majority voting to secure accurate computation of MSB cycles. These approaches improve inference quality while maintaining the energy efficiency benefits of ACiM, offering promising pathways toward reliable ACiM deployment in real-world applications.