Fully Homomorphic Encryption (FHE) imposes substantial memory bandwidth demands, which makes efficient hardware acceleration challenging. Near-memory processing (NMP) has emerged as a promising architectural approach to alleviating this memory bottleneck. However, the irregular memory access patterns and flexible dataflows inherent to FHE limit the effectiveness of existing NMP accelerators, which fail to fully utilize the available near-memory bandwidth. In this work, we propose FlexMem, a near-memory accelerator featuring highly parallel computational units that support varying memory access strides and interconnect topologies, allowing it to handle irregular memory access patterns effectively. Furthermore, we design polynomial-level and ciphertext-level dataflows that use near-memory bandwidth efficiently under varying degrees of polynomial parallelism and improve parallel performance. Experimental results demonstrate that FlexMem achieves a 1.12× performance improvement over state-of-the-art near-memory architectures while reaching 95.7% near-memory bandwidth utilization.