Most current approaches to protecting privacy in machine learning (ML) assume that models exist in a vacuum. In reality, ML models are part of larger systems that include components for training data filtering, output monitoring, and more. In this work, we introduce privacy side channels: attacks that exploit these system-level components to extract private information at far higher rates than is otherwise possible for standalone models. We propose four categories of side channels that span the entire ML lifecycle (training data filtering, input preprocessing, output post-processing, and query filtering) and enable either enhanced membership inference attacks or novel threats such as extracting users' test queries. For example, we show that deduplicating training data before applying differentially private training creates a side channel that completely invalidates any provable privacy guarantees. Moreover, we show that systems that block language models from regenerating training data can be exploited to allow exact reconstruction of private keys contained in the training set, even if the model did not memorize these keys. Taken together, our results demonstrate the need for a holistic, end-to-end privacy analysis of machine learning.
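To make the deduplication claim concrete, the following is a minimal toy sketch, not the authors' construction. It assumes a hypothetical deduplication policy that drops every record having a near-duplicate elsewhere in the dataset; the similarity rule, function names, and data are illustrative assumptions. It shows why filtering before differentially private training is a problem: standard guarantees are stated for datasets that differ in one record, but after deduplication the effective training sets can differ in many.

```python
from collections import Counter
from typing import List


def near_duplicates(a: str, b: str, max_diff: int = 1) -> bool:
    """Hypothetical similarity rule: same length, at most max_diff differing positions."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) <= max_diff


def deduplicate(records: List[str]) -> List[str]:
    """Assumed policy: drop every record that has a near-duplicate elsewhere in the data."""
    return [
        r
        for i, r in enumerate(records)
        if not any(
            near_duplicates(r, other)
            for j, other in enumerate(records)
            if j != i
        )
    ]


def symmetric_difference_size(a: List[str], b: List[str]) -> int:
    """Number of records that appear in one multiset but not the other."""
    ca, cb = Counter(a), Counter(b)
    return sum(((ca - cb) + (cb - ca)).values())


# Four records that are pairwise 2 positions apart, so none is a near-duplicate of another.
base = ["aaab", "aaba", "abaa", "baaa"]
# One extra record that is 1 position away from every record in `base`.
target = "aaaa"

without_target = deduplicate(base)           # nothing is removed
with_target = deduplicate(base + [target])   # every record now has a near-duplicate

# The raw datasets differ in one record; the filtered datasets differ in four.
gap = symmetric_difference_size(without_target, with_target)
```

In this toy setting, a privacy analysis that treats the filtered dataset as the input, and therefore assumes neighbouring datasets differ in a single record, underestimates the influence of `target`. An end-to-end analysis would have to account for the deduplication step itself.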