Balancing privacy and accuracy is challenging in federated query processing over multiple private data silos. In this work, we demonstrate an end-to-end workflow that automates an emerging privacy-preserving technique: a deep learning model, trained with the Differentially-Private Stochastic Gradient Descent (DP-SGD) algorithm, replaces portions of the actual data when answering a query. Our proposed declarative privacy-preserving workflow lets users specify "what private information to protect" rather than "how to protect" it. Under the hood, the system automatically chooses the query-model transformation plans and the hyper-parameters. The workflow also allows human experts to review and tune the selected privacy-preserving mechanism for audit, compliance, and optimization purposes.
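For readers unfamiliar with DP-SGD, the following is a minimal sketch of a single update step: compute per-example gradients, clip each to a fixed L2 norm, add Gaussian noise to the sum, then average. It is not the system described in this work. The linear-regression loss, the function name `dp_sgd_step`, and the parameter names `clip_norm` and `noise_multiplier` are assumptions chosen for illustration, and privacy accounting (tracking the resulting epsilon and delta) is omitted.

```python
import numpy as np


def dp_sgd_step(w, X, y, lr, clip_norm, noise_multiplier, rng):
    """One illustrative DP-SGD step for linear regression (hypothetical helper)."""
    # Per-example gradients of 0.5 * (x @ w - y)^2, shape (batch, dim).
    residual = X @ w - y
    per_example_grads = residual[:, None] * X

    # Clip each example's gradient so its L2 norm is at most clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale

    # Add Gaussian noise with std = noise_multiplier * clip_norm to the sum,
    # then average over the batch.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=w.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / X.shape[0]

    return w - lr * noisy_mean


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(32, 4))
    y = X @ np.array([1.0, -2.0, 0.5, 0.0])
    w = np.zeros(4)
    for _ in range(200):
        w = dp_sgd_step(w, X, y, lr=0.1, clip_norm=1.0,
                        noise_multiplier=0.5, rng=rng)
    print(w)
```

A larger `noise_multiplier` or a smaller `clip_norm` gives stronger protection at the cost of accuracy, which is the kind of hyper-parameter trade-off the workflow is described as selecting automatically.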