Machine Learning (ML) has emerged as one of data science's most transformative and influential domains. However, the widespread adoption of ML introduces privacy-related concerns owing to the increasing number of malicious attacks targeting ML models. To address these concerns, Privacy-Preserving Machine Learning (PPML) methods have been introduced to safeguard the privacy and security of ML models. One such approach is the use of Homomorphic Encryption (HE). However, the significant drawbacks and inefficiencies of traditional HE render it impractical for highly scalable scenarios. Fortunately, a modern cryptographic scheme, Hybrid Homomorphic Encryption (HHE), has recently emerged, combining the strengths of symmetric cryptography and HE to surmount these challenges. Our work seeks to introduce HHE to ML by designing a PPML scheme tailored for end devices. We leverage HHE as the fundamental building block to enable secure learning of classification outcomes over encrypted data, all while preserving the privacy of the input data and ML model. We demonstrate the real-world applicability of our construction by developing and evaluating an HHE-based PPML application for classifying heart disease based on sensitive ECG data. Notably, our evaluations revealed a slight reduction in accuracy compared to inference on plaintext data. Additionally, both the analyst and end devices experience minimal communication and computation costs, underscoring the practical viability of our approach. The successful integration of HHE into PPML provides a glimpse into a more secure and privacy-conscious future for machine learning on relatively constrained end devices.
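The HHE idea summarized above (encrypt data cheaply with a symmetric cipher, send the symmetric key under HE, and let the server convert and compute on ciphertexts) can be illustrated with a minimal sketch. This is not the authors' construction, and everything in it is an assumption made for illustration: a toy Paillier scheme with tiny primes stands in for the HE scheme, an insecure linear pad `key * (i + 1) mod N` stands in for the HE-friendly symmetric cipher a real HHE scheme would use, and the classifier is a hypothetical integer linear model. All function names are hypothetical, and the three roles (device, server, key holder) run in one process.

```python
import math
import secrets

# Toy Paillier (additively homomorphic) parameters.
# Assumption: tiny primes for readability only; this is NOT secure.
P, Q = 1009, 1013
N = P * Q
N2 = N * N
G = N + 1
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)  # modular inverse (Python 3.8+)


def he_encrypt(m):
    """Encrypt integer m (reduced mod N) under the toy Paillier public key."""
    r = secrets.randbelow(N - 1) + 1
    while math.gcd(r, N) != 1:
        r = secrets.randbelow(N - 1) + 1
    return (pow(G, m % N, N2) * pow(r, N, N2)) % N2


def he_decrypt(c):
    """Decrypt and decode to a signed integer centered around zero."""
    m = ((pow(c, LAM, N2) - 1) // N * MU) % N
    return m - N if m > N // 2 else m


def he_add(c1, c2):
    """Enc(a), Enc(b) -> Enc(a + b)."""
    return (c1 * c2) % N2


def he_scale(c, s):
    """Enc(a), plaintext s -> Enc(s * a)."""
    return pow(c, s % N, N2)


def sym_encrypt(key, values):
    """Toy symmetric cipher (assumption, insecure): c_i = m_i + key * (i + 1) mod N."""
    return [(m + key * (i + 1)) % N for i, m in enumerate(values)]


def transcipher(sym_ct, enc_key):
    """Server side: turn symmetric ciphertexts into HE ciphertexts of the data.

    Uses only Enc(key): Enc(c_i) + (-(i + 1)) * Enc(key) = Enc(m_i).
    """
    out = []
    for i, c in enumerate(sym_ct):
        neg_pad = he_scale(enc_key, -(i + 1))
        out.append(he_add(he_encrypt(c), neg_pad))
    return out


def encrypted_linear_score(enc_features, weights, bias):
    """Evaluate sum(w_i * x_i) + bias on HE ciphertexts."""
    acc = he_encrypt(bias)
    for c, w in zip(enc_features, weights):
        acc = he_add(acc, he_scale(c, w))
    return acc


def run_demo(features, weights, bias):
    """End-to-end flow: device encrypts, server computes, key holder decrypts."""
    key = secrets.randbelow(N)                 # device: fresh symmetric key
    sym_ct = sym_encrypt(key, features)        # device: cheap symmetric encryption
    enc_key = he_encrypt(key)                  # device: one HE ciphertext for the key
    enc_features = transcipher(sym_ct, enc_key)                      # server
    enc_score = encrypted_linear_score(enc_features, weights, bias)  # server
    return he_decrypt(enc_score)               # HE secret-key holder


if __name__ == "__main__":
    print(run_demo([3, 7, 2, 9], [2, -1, 4, -3], 5))
```

In this sketch the device performs only modular additions plus a single HE encryption of the key, while all per-feature homomorphic work happens on the server. That is the asymmetry that makes the hybrid approach attractive for constrained end devices.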