The privacy-utility tradeoff remains one of the fundamental issues in differentially private machine learning. This paper introduces a geometrically inspired kernel-based approach to mitigate the accuracy loss in classification. In this approach, a representation of the affine hull of the given data points is learned in a Reproducing Kernel Hilbert Space (RKHS). This leads to a novel distance measure that hides privacy-sensitive information about individual data points and improves the privacy-utility tradeoff by significantly reducing the risk of membership inference attacks. The effectiveness of the approach is demonstrated through experiments on the MNIST dataset, the Freiburg groceries dataset, and a real biomedical dataset. The approach is verified to remain computationally practical. Its application to federated learning is also considered, and the accuracy loss due to the data being distributed is observed to be either marginal or not significantly high.
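To make the geometric idea concrete, the following is a minimal sketch of a distance from a query point to the affine hull of a set of data points in an RKHS. It is not the method of the paper: it assumes a plain ridge-regularized projection solved in closed form, and it includes no privacy mechanism. The function names (`rbf_kernel`, `linear_kernel`, `affine_hull_distance`) and the parameters `gamma` and `ridge` are hypothetical and chosen for illustration only.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gaussian kernel matrix between the rows of A and the rows of B.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def linear_kernel(A, B):
    # Plain inner-product kernel; the RKHS is then the input space itself.
    return A @ B.T

def affine_hull_distance(X, x, kernel=rbf_kernel, ridge=1e-8):
    """Distance from phi(x) to the affine hull of {phi(x_i)} in feature space.

    Solves  min_a ||phi(x) - sum_i a_i phi(x_i)||^2  subject to  sum_i a_i = 1,
    using a small ridge term (an assumption of this sketch) for stability.
    Returns the distance and the affine coefficients a.
    """
    x = np.atleast_2d(x)
    K = kernel(X, X)            # Gram matrix of the data points
    k = kernel(X, x)[:, 0]      # kernel values between data points and the query
    kxx = kernel(x, x)[0, 0]    # squared feature-space norm of the query
    ones = np.ones(len(X))
    K_reg = K + ridge * np.eye(len(X))
    u = np.linalg.solve(K_reg, k)
    v = np.linalg.solve(K_reg, ones)
    # The Lagrange multiplier enforces that the coefficients sum to one.
    nu = (ones @ u - 1.0) / (ones @ v)
    a = u - nu * v
    d2 = a @ K @ a - 2.0 * a @ k + kxx
    return float(np.sqrt(max(d2, 0.0))), a

# With a linear kernel the result matches ordinary geometry: the affine hull
# of (0, 0) and (1, 0) is the x-axis, so the point (0.5, 2) lies 2 units away.
X = np.array([[0.0, 0.0], [1.0, 0.0]])
dist, coeffs = affine_hull_distance(X, np.array([0.5, 2.0]), kernel=linear_kernel)
print(dist, coeffs.sum())
```

In a classification setting, one plausible use of such a distance is to compute it against the points of each class and assign a query to the class whose affine hull is nearest; whether and how the paper does this is not stated in the abstract.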