Fully Homomorphic Encryption (FHE) is known to be extremely computationally intensive, and application-specific accelerators have emerged as a powerful way to narrow the performance gap. Nonetheless, owing to the growing complexity of individual FHE schemes and of the multi-scheme FHE algorithms used in end-to-end privacy-preserving tasks, existing FHE accelerators often suffer from low hardware utilization and insufficient memory bandwidth. In this work, we present APACHE, a layered near-memory computing hierarchy tailored for multi-scheme FHE acceleration. By closely inspecting the data flow across different FHE schemes, we design a layered near-memory computing architecture with fine-grained functional units that significantly improves the utilization of both computational resources and memory bandwidth. Experimental results show that APACHE outperforms state-of-the-art ASIC FHE accelerators by 10.63x to 35.47x across a variety of application benchmarks, e.g., Lola MNIST, HELR, VSP, and HE$^{3}$DB.