In the third round of the NIST Post-Quantum Cryptography standardization project, the focus is on optimizing software and hardware implementations of the candidate schemes. The winning schemes are CRYSTALS-Kyber and CRYSTALS-Dilithium, which serve as a Key Encapsulation Mechanism (KEM) and a Digital Signature Algorithm (DSA), respectively. This study uses the open-source TaPaSCo framework to create hardware building blocks for both schemes, at all security levels, by applying High-Level Synthesis (HLS) to minimally modified ANSI C software reference implementations. In addition, a generic TaPaSCo host runtime application is developed in Rust to verify their functionality on actual hardware through the standard NIST interface, using the corresponding Known Answer Test (KAT) mechanism. Building on this foundation, the communication overhead of TaPaSCo hardware accelerators on PCIe-connected FPGA devices is evaluated and compared with previous work and with the optimized AVX2 software reference implementations. The results show that verifying and evaluating the performance of Post-Quantum Cryptography accelerators on real hardware with TaPaSCo is feasible. Furthermore, the off-chip accelerator communication overhead of the NIST standard interface is measured; this overhead alone exceeds the wall-clock execution time of the optimized software reference implementation of Kyber at Security Level 1.