Nature-inspired computing has recently gained significant attention for solving hard optimization problems, particularly through Ising machines applied to NP-complete problems. Existing Ising accelerators range from quantum and optical annealers to CMOS-based von Neumann and in-memory architectures. However, many prior designs are specialized accelerators limited to specific problem classes, rely on ADC/DAC circuits, and suffer from reliability problems because their embedded memory technologies are sensitive to process variation. This paper presents SACHI, an all-digital Ising architecture implemented by repurposing the L1 cache of a CPU using SRAM-based processing-in-memory techniques. SACHI eliminates the need for ADCs/DACs, improves reliability compared to prior approaches such as BRIM, and enables Ising acceleration with minimal hardware overhead by integrating into the CPU pipeline. The paper also provides detailed architectural analysis and pseudo-code for the proposed algorithms. The key contributions of SACHI are: (i) tight integration of the accelerator with the CPU pipeline, (ii) reuse of existing cache hardware for acceleration, (iii) higher parallelism enabled through reuse-aware computation, and (iv) improved performance and energy efficiency for large-scale, high-precision optimization problems using novel compute and mapping strategies. Compared to BRIM, SACHI achieves a 300x performance improvement and an 80x energy reduction across applications including asset allocation, molecular dynamics, image segmentation, and traveling salesman problems. Additionally, reuse factors of up to 4000x are observed for several workloads. This work demonstrates that reliable and efficient all-digital Ising acceleration can be achieved using commodity SRAM structures tightly integrated with general-purpose processors.
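For readers unfamiliar with what an Ising machine computes, the following is a minimal sketch of the standard Ising energy and a greedy single-spin-flip descent. It is not SACHI's compute or mapping strategy, which the abstract does not detail. The function names, the sign convention H(s) = -sum_{i<j} J_ij s_i s_j - sum_i h_i s_i, and the assumption of a symmetric coupling matrix with a zero diagonal are all illustrative choices.

```python
def ising_energy(J, h, s):
    """Energy H(s) = -sum_{i<j} J[i][j]*s[i]*s[j] - sum_i h[i]*s[i].

    Assumes J is symmetric with a zero diagonal and each s[i] is -1 or +1.
    """
    n = len(s)
    pair = sum(J[i][j] * s[i] * s[j] for i in range(n) for j in range(i + 1, n))
    field = sum(h[i] * s[i] for i in range(n))
    return -pair - field


def local_field(J, h, s, i):
    """Effective field on spin i from its neighbours plus the external field."""
    return sum(J[i][j] * s[j] for j in range(len(s)) if j != i) + h[i]


def greedy_descent(J, h, s, max_sweeps=100):
    """Flip any spin whose flip lowers the energy, until no flip helps.

    Hypothetical helper for illustration: a plain local search that can stall
    in a local minimum, unlike annealing-based Ising machines.
    """
    s = list(s)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            # Flipping s[i] changes the energy by 2 * s[i] * local_field.
            delta = 2 * s[i] * local_field(J, h, s, i)
            if delta < 0:
                s[i] = -s[i]
                changed = True
        if not changed:
            break
    return s


# Three-spin ferromagnetic chain: aligned spins minimise the energy.
J = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
h = [0, 0, 0]
start = [1, -1, 1]
final = greedy_descent(J, h, start)
print(ising_energy(J, h, start), ising_energy(J, h, final))
```

An optimization problem such as the traveling salesman problem is mapped to an Ising machine by encoding its cost function in the couplings J and fields h, so that a low-energy spin configuration corresponds to a good solution.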