Polynomial multiplication is a fundamental operation in many applications, such as fully homomorphic encryption (FHE). However, the cost of multiplying polynomials with many large-bit-width coefficients poses a significant challenge for practical FHE implementations. The Number Theoretic Transform (NTT) has proven an effective tool for accelerating polynomial multiplication, but a fast and adaptable method for generating NTT accelerators is lacking. In this paper, we introduce HF-NTT, a novel NTT accelerator. HF-NTT efficiently handles polynomials of varying degrees and moduli, and it allows performance to be traded against hardware resources by adjusting the number of Processing Elements (PEs). We also introduce a data movement strategy that eliminates the need for bit-reversal operations, resolves different hazards, and reduces the clock cycle count. Furthermore, our accelerator includes a hardware-friendly modular multiplication design and a configurable PE capable of adapting its data path, resulting in a universal architecture. We synthesized and implemented a prototype using Vivado 2022.2 and evaluated it on the Xilinx Virtex-7 FPGA platform. The results demonstrate significant improvements in Area-Time-Product (ATP) and processing speed for different polynomial degrees. In scenarios involving multi-modulus polynomial multiplication, our prototype consistently outperforms other designs in both ATP and latency metrics.
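As a software illustration of the idea the abstract relies on, the following minimal Python sketch multiplies two polynomials with an NTT and never performs an explicit bit-reversal permutation. It uses a common textbook pairing: a decimation-in-frequency forward transform (natural-order input, bit-reversed output) followed by a decimation-in-time inverse transform (bit-reversed input, natural-order output). This is not HF-NTT's data movement strategy, which the abstract does not detail. The function names and the toy parameters (modulus q = 17, length n = 8, root 9) are assumptions chosen for readability, and the product is a plain cyclic one with zero padding rather than the negacyclic product typically used in FHE. Python 3.8 or later is assumed for modular inverses via `pow`.

```python
def ntt_dif(a, root, q):
    """Forward NTT, decimation in frequency (Gentleman-Sande butterflies).
    Input in natural order, output in bit-reversed order."""
    a = list(a)
    n = len(a)
    length = n // 2
    w_len = root  # primitive root of unity for the current sub-transform size
    while length >= 1:
        for start in range(0, n, 2 * length):
            w = 1
            for j in range(start, start + length):
                u, v = a[j], a[j + length]
                a[j] = (u + v) % q
                a[j + length] = (u - v) * w % q
                w = w * w_len % q
        w_len = w_len * w_len % q
        length //= 2
    return a


def intt_dit(a, root, q):
    """Inverse NTT, decimation in time (Cooley-Tukey butterflies).
    Input in bit-reversed order, output in natural order."""
    a = list(a)
    n = len(a)
    inv_root = pow(root, -1, q)
    length = 1
    while length < n:
        # Inverse of the twiddle base used by the matching forward stage.
        w_len = pow(inv_root, n // (2 * length), q)
        for start in range(0, n, 2 * length):
            w = 1
            for j in range(start, start + length):
                u, v = a[j], a[j + length] * w % q
                a[j] = (u + v) % q
                a[j + length] = (u - v) % q
                w = w * w_len % q
        length *= 2
    n_inv = pow(n, -1, q)
    return [x * n_inv % q for x in a]


def poly_mul_ntt(f, g, n, root, q):
    """Multiply f and g modulo q; requires len(f) + len(g) - 1 <= n."""
    fa = ntt_dif(f + [0] * (n - len(f)), root, q)
    ga = ntt_dif(g + [0] * (n - len(g)), root, q)
    # Both spectra share the same bit-reversed order, so the pointwise
    # product needs no reordering before the inverse transform.
    prod = [x * y % q for x, y in zip(fa, ga)]
    return intt_dit(prod, root, q)


def poly_mul_schoolbook(f, g, n, q):
    """O(n^2) reference product, used only to check the NTT result."""
    out = [0] * n
    for i, x in enumerate(f):
        for j, y in enumerate(g):
            out[i + j] = (out[i + j] + x * y) % q
    return out


# Toy parameters (assumed): 9 is a primitive 8th root of unity modulo 17.
Q, N, ROOT = 17, 8, 9
f = [1, 2, 3, 4]
g = [5, 6, 7, 8]
result = poly_mul_ntt(f, g, N, ROOT, Q)
print(result)
```

The pairing works because every inverse stage undoes one forward stage up to a factor of 2, and the final scaling by the inverse of n removes those factors. A hardware design has further concerns, such as memory access conflicts and pipeline hazards, that this sketch does not model.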