Protecting sensitive information in data-driven collaborations such as AI training, while meeting the diverse requirements of multiple mutually distrusting stakeholders, is both crucial and challenging. This paper presents Styx, a novel framework that addresses this challenge by integrating sticky policies with Trusted Execution Environments (TEEs). At a high level, Styx employs middleware protected by a hardware TEE, together with a programming language runtime, to form a sandboxed environment for both data processing and policy enforcement. We designed a data processing workflow and pipelines that enable strong yet flexible data-specific policy enforcement throughout the entire data lifecycle and across data derivation, achieving data-in-use protection, data lifecycle protection, and dynamic collaboration. We implemented Styx and demonstrated its ability to make collaborative computing, such as joint AI training, more secure, privacy-preserving, and policy-compliant. Our evaluation shows that the performance overheads imposed by Styx are reasonable for single-node computation and that Styx has the capability to scale to large distributed multi-node deployments.