Fully Homomorphic Encryption (FHE) enables secure computation over encrypted data, but its computational cost remains a major obstacle to practical deployment. To mitigate this overhead, many studies have explored GPU acceleration for the CKKS scheme, which is widely used for approximate arithmetic. CKKS parameters are configured for each workload by balancing multiplicative depth, security requirements, and performance. These parameters significantly affect ciphertext size and thereby determine how the memory footprint fits within the GPU memory hierarchy. Nevertheless, prior studies typically apply their proposed optimizations uniformly, without considering differences in CKKS parameter configurations. In this work, we demonstrate that the best GPU optimization strategy for CKKS depends on the CKKS parameter configuration. We first classify prior optimizations by two aspects of dataflow that affect memory footprint, and then conduct both qualitative and quantitative performance analyses. Our analysis shows that, even on the same GPU architecture, the best strategy varies with the CKKS parameters, with performance differences of up to $1.98\times$ between strategies, and that the criteria for selecting an appropriate strategy differ across GPU architectures.