Fully Homomorphic Encryption (FHE) is rapidly emerging as a promising foundation for privacy-preserving cloud services because it enables computation directly on encrypted data. As FHE implementations mature and move toward practical deployment in domains such as secure finance, biomedical analytics, and privacy-preserving AI, a critical question remains insufficiently explored: how reliable is FHE computation on real hardware? The question matters for two reasons. First, FHE performs far more computation than its plaintext counterpart for the same task, which increases its exposure to transient hardware faults. Second, the resulting data corruptions are likely to remain silent: the FHE service has no access to the underlying plaintext, so it cannot tell that the decrypted result has been corrupted. To address this gap, we conduct a comprehensive evaluation of silent data corruptions (SDCs) in FHE ciphertext computation. Through large-scale fault-injection experiments, we characterize the vulnerability of FHE to transient faults, and through a theoretical analysis of error-propagation behavior, we gain deeper algorithmic insight into the mechanisms underlying this vulnerability. We further assess the effectiveness of different fault-tolerance mechanisms in mitigating these faults.