Fully homomorphic encryption (FHE) is in the spotlight as a definitive solution for privacy, but its high computational overhead poses a challenge to practical adoption. Although prior studies have designed ASIC accelerators to mitigate the overhead, those designs require excessive chip resources (e.g., area) to hold and process the massive data involved in FHE operations. We propose CiFHER, a chiplet-based FHE accelerator with a resizable structure, to tackle the challenge with a cost-effective multi-chip module (MCM) design. First, we devise a flexible chiplet core architecture whose configuration can be adjusted to conform to the global organization of chiplets and to design constraints. The distinctive feature of our core is a recomposable functional unit that provides varying computational throughput for the number-theoretic transform (NTT), the dominant function in FHE. Then, we establish generalized data mapping methodologies to minimize the network overhead incurred when organizing the chiplets into the MCM package in a tiled manner; this overhead becomes a significant bottleneck due to the technology constraints of MCMs. We also analyze the effectiveness of various algorithms, including a novel limb duplication algorithm, on the MCM architecture. A detailed evaluation shows that a CiFHER package composed of 4 to 64 compact chiplets provides performance comparable to state-of-the-art monolithic ASIC FHE accelerators with significantly lower package-wide power consumption, while reducing the area of a single core to as small as 4.28 mm$^2$.