The privacy-utility tradeoff remains one of the fundamental issues in differentially private machine learning. This paper introduces a geometrically inspired kernel-based approach to mitigate accuracy loss in classification. In this approach, a representation of the affine hull of the given data points is learned in a reproducing kernel Hilbert space (RKHS). This leads to a novel distance measure that hides privacy-sensitive information about individual data points and improves the privacy-utility tradeoff by significantly reducing the risk of membership inference attacks. The effectiveness of the approach is demonstrated through experiments on the MNIST dataset, the Freiburg Groceries dataset, and a real biomedical dataset. The approach is also verified to remain computationally practical. Finally, its application to federated learning is considered, and the accuracy loss caused by distributing the data is observed to be either marginal or not significantly high.
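The geometric idea of measuring how far a point lies from the affine hull of a data set in an RKHS can be illustrated with a minimal sketch. This is not the paper's learned representation or its privacy mechanism; it is the textbook kernel affine-hull distance, obtained by minimising a quadratic form under the constraint that the coefficients sum to one. The RBF kernel, the `gamma` bandwidth, the `ridge` stabiliser, and the function names are all assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def affine_hull_distance(X, y, gamma=1.0, ridge=1e-8):
    """Squared RKHS distance from phi(y) to the affine hull of {phi(x_i)}.

    Solves  min_a  a^T K a - 2 a^T k_y + k(y, y)  subject to  sum(a) = 1
    through its KKT linear system. `ridge` is an illustrative stabiliser
    added to the diagonal of K (an assumption, not part of the source).
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma) + ridge * np.eye(n)
    k_y = rbf_kernel(X, y[None, :], gamma).ravel()

    # KKT system: [[2K, 1], [1^T, 0]] [a; lam] = [2 k_y; 1]
    kkt = np.zeros((n + 1, n + 1))
    kkt[:n, :n] = 2.0 * K
    kkt[:n, n] = 1.0
    kkt[n, :n] = 1.0
    rhs = np.append(2.0 * k_y, 1.0)
    a = np.linalg.solve(kkt, rhs)[:n]

    k_yy = 1.0  # k(y, y) = 1 for the RBF kernel
    d2 = a @ K @ a - 2.0 * a @ k_y + k_yy
    return max(d2, 0.0), a


if __name__ == "__main__":
    X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    d_near, _ = affine_hull_distance(X, np.array([0.5, 0.5]))
    d_far, _ = affine_hull_distance(X, np.array([50.0, 50.0]))
    print(d_near, d_far)
```

In this sketch a query point is scored only against the hull of the data set rather than against any single stored sample, which gives a rough intuition for why a hull-based distance can expose less about individual points. A classifier in this spirit would assign a query to the class whose hull is nearest.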