As machine learning becomes more relevant to everyday applications, protecting the privacy of the training data is a natural requirement. When the relevant learning questions are unknown in advance, or when hyper-parameter tuning plays a central role, one solution is to release a differentially private synthetic data set that leads to conclusions similar to those drawn from the original training data. In this work, we introduce an algorithm that enjoys fast rates for the utility loss on sparse Lipschitz queries. Furthermore, we show how to obtain a certificate for the utility loss for a large class of algorithms.