Differential privacy upper-bounds the information leakage of machine learning models, yet providing meaningful privacy guarantees has proven challenging in practice. The private prediction setting, in which model outputs are privatized, has been investigated as an alternative way to provide formal guarantees at prediction time. Most current private prediction algorithms, however, rely on global sensitivity for noise calibration, which often results in large amounts of noise being added to the predictions. Data-specific noise calibration, such as smooth sensitivity, could significantly reduce the amount of noise added, but has so far been infeasible to compute exactly for modern machine learning models. In this work we present a novel and practical approach based on convex relaxation and bound propagation to compute a provable upper bound on the local and smooth sensitivity of a prediction. This bound allows us to reduce the magnitude of the added noise or to improve privacy accounting in the private prediction setting. We validate our framework across models on datasets from financial services, medical image classification, and natural language processing, and find that our approach reduces the added noise by up to an order of magnitude.
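To make the role of sensitivity in noise calibration concrete, the following is a minimal illustrative sketch (not the paper's algorithm): a prediction is privatized by adding Laplace noise whose scale is proportional to a sensitivity bound divided by the privacy parameter epsilon. A tighter, data-specific sensitivity bound (such as the smooth-sensitivity upper bound computed in this work) directly shrinks the noise scale; the function name and interface here are hypothetical, and a full smooth-sensitivity mechanism requires additional care in the choice of noise distribution.

```python
import numpy as np

def privatize_prediction(logits, sensitivity, epsilon, rng):
    """Release a noisy argmax prediction.

    Adds Laplace noise with scale sensitivity / epsilon to each logit
    before taking the argmax. `sensitivity` may be a global-sensitivity
    constant or a (typically much smaller) data-specific upper bound;
    a smaller bound means less noise for the same epsilon.
    """
    noisy = logits + rng.laplace(scale=sensitivity / epsilon, size=logits.shape)
    return int(np.argmax(noisy))

# Illustration: the same prediction under a loose global bound versus a
# hypothetical tighter data-specific bound (values are made up).
rng = np.random.default_rng(0)
logits = np.array([2.0, 0.5, -1.0])
pred_global = privatize_prediction(logits, sensitivity=1.0, epsilon=1.0, rng=rng)
pred_smooth = privatize_prediction(logits, sensitivity=0.1, epsilon=1.0, rng=rng)
```

With the tighter bound, the noise scale drops tenfold, so the released label far more often matches the non-private argmax at the same privacy budget.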