The privacy-utility tradeoff remains one of the fundamental issues in differentially private machine learning. This paper introduces a geometrically inspired kernel-based approach to mitigate the accuracy loss in classification. In this approach, a representation of the affine hull of the given data points is learned in a Reproducing Kernel Hilbert Space (RKHS). This leads to a novel distance measure that hides privacy-sensitive information about individual data points and improves the privacy-utility tradeoff by significantly reducing the risk of membership inference attacks. The effectiveness of the approach is demonstrated through experiments on the MNIST dataset, the Freiburg groceries dataset, and a real biomedical dataset. The approach is also shown to remain computationally practical. Finally, its application to federated learning is considered, and the accuracy loss caused by the data being distributed is observed to be either marginal or not significantly high.
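To make the geometric idea concrete, the following is a minimal sketch of a generic kernel affine-hull distance, not the method of this paper: the abstract does not specify how the representation is learned or how privacy is enforced, so no noise or privacy mechanism is included here. The sketch assumes an RBF kernel, and the function names (`rbf_kernel`, `affine_hull_sq_distance`, `classify`) and the parameters `gamma` and `ridge` are hypothetical choices made for illustration. For a query point x, it finds coefficients a with sum(a) = 1 that minimize the RKHS distance between φ(x) and Σ aᵢ φ(xᵢ), and then assigns x to the class whose affine hull is nearest.

```python
import numpy as np


def rbf_kernel(A, B, gamma=0.5):
    """RBF kernel matrix k(a, b) = exp(-gamma * ||a - b||^2) (assumed kernel)."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def affine_hull_sq_distance(X, x, gamma=0.5, ridge=1e-6):
    """Squared RKHS distance from phi(x) to the affine hull of {phi(x_i)}.

    Solves  min_a  a^T K a - 2 a^T k_x + k(x, x)   subject to  sum(a) = 1,
    with a small ridge term added to K for numerical stability (an assumption
    of this sketch). Returns the squared distance and the coefficients a.
    """
    n = X.shape[0]
    K0 = rbf_kernel(X, X, gamma)
    K = K0 + ridge * np.eye(n)
    k_x = rbf_kernel(X, x[None, :], gamma).ravel()
    ones = np.ones(n)

    # Lagrangian solution: a = K^{-1} (k_x + mu * 1), with mu fixed by sum(a) = 1.
    Kinv_k = np.linalg.solve(K, k_x)
    Kinv_1 = np.linalg.solve(K, ones)
    mu = (1.0 - ones @ Kinv_k) / (ones @ Kinv_1)
    a = Kinv_k + mu * Kinv_1

    k_xx = 1.0  # for the RBF kernel, k(x, x) = 1
    d2 = a @ K0 @ a - 2.0 * a @ k_x + k_xx
    return max(float(d2), 0.0), a


def classify(class_data, x, gamma=0.5, ridge=1e-6):
    """Assign x to the class whose kernel affine hull is closest."""
    dists = [affine_hull_sq_distance(X, x, gamma, ridge)[0] for X in class_data]
    return int(np.argmin(dists))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    class0 = rng.normal(loc=0.0, scale=0.5, size=(20, 2))
    class1 = rng.normal(loc=5.0, scale=0.5, size=(20, 2))
    query = np.array([0.1, -0.2])
    print("predicted class:", classify([class0, class1], query))
```

Because the distance depends on the data only through the hull as a whole, a classifier built this way does not need to expose individual training points at prediction time; how the paper turns that observation into a formal privacy guarantee is not stated in the abstract and is not modeled in this sketch.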