Differential privacy has become crucial in the real-world deployment of statistical and machine learning algorithms with rigorous privacy guarantees. Among the earliest statistical queries for which differential privacy mechanisms were developed is the release of the sample mean. In geometric statistics, the sample Fr\'echet mean is one of the most fundamental statistical summaries, as it generalizes the sample mean to data on nonlinear manifolds. In that spirit, the only geometric statistical query for which a differential privacy mechanism has been developed so far is the release of the sample Fr\'echet mean: the \emph{Riemannian Laplace mechanism} was recently proposed to privatize the Fr\'echet mean on complete Riemannian manifolds. In many fields, the manifold of symmetric positive definite (SPD) matrices is used to model data spaces, including in medical imaging, where privacy requirements are key. We propose a novel, simple and fast mechanism, the \emph{tangent Gaussian mechanism}, to compute a differentially private Fr\'echet mean on the SPD manifold endowed with the log-Euclidean Riemannian metric. We show that our new mechanism has significantly better utility and is computationally efficient, as confirmed by extensive experiments.
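To make the idea concrete, here is a minimal sketch, not the authors' implementation. It relies on one standard fact: under the log-Euclidean metric, the Fr\'echet mean of SPD matrices has the closed form exp of the average of the matrix logarithms. Everything else is an assumption made for illustration: the function names are hypothetical, the noise is an isotropic Gaussian on the space of symmetric matrices (with respect to the Frobenius inner product), and `sigma` is a placeholder. The abstract does not state how the noise scale is calibrated, so this sketch carries no privacy guarantee by itself.

```python
import numpy as np


def spd_log(X):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(X)
    return (V * np.log(w)) @ V.T


def sym_exp(S):
    """Matrix exponential of a symmetric matrix; the result is SPD."""
    w, V = np.linalg.eigh(S)
    return (V * np.exp(w)) @ V.T


def log_euclidean_mean(mats):
    """Closed-form Frechet mean under the log-Euclidean metric."""
    mean_log = np.mean([spd_log(X) for X in mats], axis=0)
    return sym_exp(mean_log)


def noisy_log_euclidean_mean(mats, sigma, rng):
    """Hypothetical tangent-space Gaussian perturbation of the mean.

    ASSUMPTION: sigma is an uncalibrated placeholder; choosing it to meet
    a given privacy budget is not covered by this sketch.
    """
    d = mats[0].shape[0]
    mean_log = np.mean([spd_log(X) for X in mats], axis=0)
    # Symmetrizing an i.i.d. Gaussian matrix gives diagonal variance
    # sigma^2 and off-diagonal variance sigma^2 / 2, i.e. an isotropic
    # Gaussian on symmetric matrices for the Frobenius inner product.
    G = rng.normal(scale=sigma, size=(d, d))
    noise = (G + G.T) / 2
    # Mapping back with exp guarantees an SPD output for any noise draw.
    return sym_exp(mean_log + noise)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = [np.diag([1.0, 4.0]), np.diag([4.0, 1.0])]
    print(log_euclidean_mean(data))
    print(noisy_log_euclidean_mean(data, sigma=0.1, rng=rng))
```

Because the perturbation is added in the flat space of symmetric matrices and mapped back with the matrix exponential, the released matrix is always SPD, and the whole procedure costs only a few eigendecompositions.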