In real-world settings involving consequential decision-making, the deployment of machine learning systems generally requires both reliable uncertainty quantification and protection of individuals' privacy. We present a framework that treats these two desiderata jointly. Our framework is based on conformal prediction, a methodology that augments predictive models to return prediction sets that quantify uncertainty: the sets provably cover the true response with a user-specified probability, such as 90%. One might hope that when used with privately trained models, conformal prediction would yield privacy guarantees for the resulting prediction sets; unfortunately, this is not the case. To remedy this key problem, we develop a method that takes any pre-trained predictive model and outputs differentially private prediction sets. Our method follows the general approach of split conformal prediction; we use holdout data to calibrate the size of the prediction sets but preserve privacy by using a privatized quantile subroutine. To guarantee correct coverage, this subroutine compensates for the noise introduced to preserve privacy. We evaluate the method on large-scale computer vision datasets.
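
To make the calibration step concrete, the following is a minimal sketch of split conformal calibration with a differentially private quantile. It is an illustration under stated assumptions, not the method evaluated in the paper: the private quantile here is a generic exponential mechanism over a fixed grid of candidate thresholds, the conformal score is taken to be one minus the softmax probability of the true class, and the names `private_quantile`, `prediction_sets`, `level`, and `num_bins` are hypothetical. In particular, the sketch does not reproduce the noise compensation described in the abstract; the caller supplies `level` directly, so the output should not be read as carrying the coverage guarantee.

```python
import numpy as np


def private_quantile(scores, level, epsilon, num_bins=100, rng=None):
    """Exponential-mechanism quantile over a fixed grid on [0, 1].

    Assumption: scores lie in [0, 1]. The utility of a candidate threshold
    is minus the distance between its rank and the target rank, which has
    sensitivity 1 when one calibration point is replaced.
    """
    rng = np.random.default_rng() if rng is None else rng
    scores = np.clip(np.asarray(scores, dtype=float), 0.0, 1.0)
    n = scores.size
    candidates = np.linspace(0.0, 1.0, num_bins + 1)
    # counts[j] = number of calibration scores <= candidates[j]
    counts = np.searchsorted(np.sort(scores), candidates, side="right")
    target = np.ceil(level * n)
    utility = -np.abs(counts - target)
    # Sample a candidate with probability proportional to exp(eps * u / 2).
    logits = epsilon * utility / 2.0
    logits -= logits.max()  # numerical stability before exponentiating
    probs = np.exp(logits)
    probs /= probs.sum()
    return rng.choice(candidates, p=probs)


def prediction_sets(softmax_probs, qhat):
    """Boolean mask of classes whose score (1 - probability) is <= qhat."""
    return (1.0 - np.asarray(softmax_probs)) <= qhat


if __name__ == "__main__":
    # Synthetic stand-in for a pre-trained classifier's holdout outputs.
    rng = np.random.default_rng(0)
    cal_probs = rng.dirichlet(np.ones(10), size=500)
    cal_labels = rng.integers(0, 10, size=500)
    cal_scores = 1.0 - cal_probs[np.arange(500), cal_labels]
    qhat = private_quantile(cal_scores, level=0.9, epsilon=1.0, rng=rng)
    print(qhat, prediction_sets(cal_probs[:3], qhat).sum(axis=1))
```

Because the threshold is sampled rather than computed exactly, it can land below the true empirical quantile; that is why a private procedure needs to adjust the quantile level upward to retain coverage. How large that adjustment must be depends on the mechanism and is left out of this sketch.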