Privacy and security challenges in Machine Learning (ML) have become increasingly severe as ML has become pervasive and as recent work has demonstrated its large attack surface. As a mature system-oriented approach, Confidential Computing has been used in both academia and industry to mitigate privacy and security issues in various ML scenarios. In this paper, we investigate the intersection of ML and Confidential Computing. We systematize prior work on Confidential Computing-assisted ML techniques that provide i) confidentiality guarantees and ii) integrity assurances, and discuss their advanced features and drawbacks. We further identify key challenges and provide dedicated analyses of the limitations of existing Trusted Execution Environment (TEE) systems for ML use cases. Finally, we discuss prospective directions, including grounded privacy definitions for closed-loop protection, partitioned execution of efficient ML, dedicated TEE-assisted designs for ML, TEE-aware ML, and full-pipeline guarantees for ML. By presenting these potential solutions in our systematization of knowledge, we aim to build a bridge toward much stronger TEE-enabled ML for privacy guarantees without introducing additional computation and system costs.