Privacy and security challenges in Machine Learning (ML) have become increasingly severe as ML has become pervasive and recent work has demonstrated its large attack surface. As a mature system-oriented approach, Confidential Computing has been used in both academia and industry to mitigate privacy and security issues in various ML scenarios. In this paper, we investigate the conjunction of ML and Confidential Computing. We systematize prior work on Confidential Computing-assisted ML techniques that provide i) confidentiality guarantees and ii) integrity assurances, and discuss their advanced features and drawbacks. We further identify key challenges and provide dedicated analyses of the limitations of existing Trusted Execution Environment (TEE) systems for ML use cases. Finally, we discuss prospective directions, including grounded privacy definitions for closed-loop protection, partitioned execution of efficient ML, dedicated TEE-assisted designs for ML, TEE-aware ML, and full-pipeline guarantees for ML. By presenting these potential solutions in our systematization of knowledge, we aim to build a bridge toward much stronger TEE-enabled ML with privacy guarantees, without introducing extra computation and system costs.