Fully homomorphic encryption (FHE) is in the spotlight as a definitive solution for privacy, but its high computational overhead poses a challenge to practical adoption. Although prior studies have designed ASIC accelerators to mitigate the overhead, these designs require excessive chip resources (e.g., area) to hold and process the massive data involved in FHE operations. We propose CiFHER, a chiplet-based FHE accelerator with a resizable structure, to tackle the challenge with a cost-effective multi-chip module (MCM) design. First, we devise a flexible chiplet core architecture whose configuration can be adjusted to conform to the global organization of chiplets and to design constraints. The distinctive feature of our core is a recomposable functional unit that provides varying computational throughput for the number-theoretic transform (NTT), the dominant function in FHE. Then, we establish generalized data mapping methodologies for organizing the chips into the MCM package in a tiled manner; these methodologies minimize the network overhead, which becomes a significant bottleneck due to the technology constraints of MCMs. We also analyze the effectiveness of various algorithms, including a novel limb duplication algorithm, on the MCM architecture. A detailed evaluation shows that a CiFHER package composed of 4 to 64 compact chiplets provides performance comparable to state-of-the-art monolithic ASIC FHE accelerators with significantly lower package-wide power consumption, while reducing the area of a single core to as small as 4.28 mm$^2$.
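For readers unfamiliar with the NTT named above as the dominant function in FHE, the following is a minimal software sketch of an iterative radix-2 NTT and the pointwise multiplication it enables. It is only an illustration, not CiFHER's hardware design: the modulus `Q`, the generator `G`, and the function names are assumptions chosen for the example, and the transform here is the plain cyclic one, whereas FHE schemes typically use a negacyclic variant with scheme-specific moduli.

```python
# Illustrative parameters (assumptions, not taken from the paper).
Q = 998244353  # NTT-friendly prime: Q - 1 = 119 * 2**23
G = 3          # a primitive root modulo Q


def ntt(a, invert=False):
    """Iterative radix-2 Cooley-Tukey NTT over Z_Q; len(a) must be a power of two."""
    n = len(a)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    a = list(a)

    # Bit-reversal permutation so the butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]

    # log2(n) butterfly stages; each stage doubles the sub-transform length.
    length = 2
    while length <= n:
        w_len = pow(G, (Q - 1) // length, Q)   # primitive length-th root of unity
        if invert:
            w_len = pow(w_len, Q - 2, Q)       # modular inverse via Fermat
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half] * w % Q
                a[start + k] = (u + v) % Q
                a[start + k + half] = (u - v) % Q
                w = w * w_len % Q
        length <<= 1

    if invert:
        n_inv = pow(n, Q - 2, Q)
        a = [x * n_inv % Q for x in a]
    return a


def cyclic_multiply(a, b):
    """Multiply two polynomials modulo (x^n - 1, Q) using the NTT."""
    fa, fb = ntt(a), ntt(b)
    return ntt([x * y % Q for x, y in zip(fa, fb)], invert=True)


if __name__ == "__main__":
    print(cyclic_multiply([1, 2, 3, 4], [5, 6, 7, 8]))  # [66, 68, 66, 60]
```

Each of the log2(n) stages performs n/2 independent butterflies, which is the kind of regular, data-parallel work whose throughput a hardware functional unit can scale.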