Less is More: Revisiting the Gaussian Mechanism for Differential Privacy

Differential privacy via output perturbation has been a \textit{de facto} standard for releasing query or computation results on sensitive data. However, we identify that all existing Gaussian mechanisms suffer from the curse of full-rank covariance matrices: the expected accuracy loss of each of these mechanisms equals the trace of the covariance matrix of its noise. To lift this curse, we design a Rank-1 Singular Multivariate Gaussian (R1SMG) mechanism. It achieves $(\epsilon,\delta)$-DP on query results in $\mathbb{R}^M$ by perturbing the results with noise following a singular multivariate Gaussian distribution whose covariance matrix is a \textbf{randomly} generated rank-1 positive semi-definite matrix. In contrast, the classic Gaussian mechanism and its variants all use \textbf{deterministic} full-rank covariance matrices. Our idea is motivated by an overlooked clue in Dwork et al.'s seminal work on the classic Gaussian mechanism: when multivariate Gaussian noise with a full-rank covariance matrix is projected onto an orthonormal basis of $\mathbb{R}^M$, only the coefficient along a single basis vector contributes to the privacy guarantee. We make the following contributions. The R1SMG mechanism achieves an $(\epsilon,\delta)$-DP guarantee on query results in $\mathbb{R}^M$, while its expected accuracy loss is lower bounded by $C_R(\Delta_2f)^2$, where $C_R = \frac{2}{\epsilon \psi}$ and $\psi = \Big(\frac{\delta\Gamma(\frac{M-1}{2})}{\sqrt{\pi}\Gamma(\frac{M}{2})}\Big)^{\frac{2}{M-2}}$. We show that $C_R$ has a decreasing trend as $M$ increases and converges to $\frac{2}{\epsilon}$ as $M$ approaches infinity. Compared with other mechanisms, the R1SMG mechanism is more stable and less likely to generate large-magnitude noise that overwhelms the query results.
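
To make the formulas above concrete, the following Python sketch evaluates $\psi$ and $C_R$ for a given $M$, $\epsilon$, and $\delta$, and contrasts $C_R(\Delta_2f)^2$ with the trace of the covariance matrix of an isotropic Gaussian mechanism. It is an illustration under stated assumptions, not the authors' implementation. The function names are hypothetical. The classic baseline assumes the textbook calibration $\sigma = \Delta_2f\sqrt{2\ln(1.25/\delta)}/\epsilon$ (stated for $\epsilon < 1$). The sampler assumes one natural reading of "randomly generated rank-1 covariance": a direction drawn uniformly from the unit sphere, scaled by a univariate Gaussian.

```python
import math
import random


def psi(M: int, delta: float) -> float:
    """psi from the abstract, evaluated in log space so large M does not overflow."""
    if M <= 2:
        raise ValueError("the exponent 2/(M-2) requires M > 2")
    log_base = (math.log(delta) + math.lgamma((M - 1) / 2)
                - 0.5 * math.log(math.pi) - math.lgamma(M / 2))
    return math.exp(2.0 / (M - 2) * log_base)


def c_r(M: int, eps: float, delta: float) -> float:
    """C_R = 2 / (eps * psi)."""
    return 2.0 / (eps * psi(M, delta))


def classic_gaussian_loss(M: int, eps: float, delta: float,
                          l2_sensitivity: float = 1.0) -> float:
    """Trace of sigma^2 * I_M under the textbook calibration (an assumption here)."""
    sigma_sq = 2.0 * math.log(1.25 / delta) * l2_sensitivity ** 2 / eps ** 2
    return M * sigma_sq


def sample_rank1_noise(M: int, variance: float, rng: random.Random) -> list:
    """Noise with covariance variance * u u^T for a random unit vector u.

    Assumption: u is uniform on the unit sphere, obtained by normalizing
    a standard Gaussian vector. The expected squared L2 norm is `variance`.
    """
    g = [rng.gauss(0.0, 1.0) for _ in range(M)]
    norm = math.sqrt(sum(x * x for x in g))
    u = [x / norm for x in g]
    z = rng.gauss(0.0, math.sqrt(variance))
    return [z * ui for ui in u]


if __name__ == "__main__":
    eps, delta = 0.5, 1e-5
    for M in (10, 100, 10_000):
        # Columns: M, C_R (unit sensitivity), trace for the classic mechanism
        print(M, c_r(M, eps, delta), classic_gaussian_loss(M, eps, delta))
```

Because the noise lies along a single direction, its expected squared norm equals the single nonzero eigenvalue of the covariance matrix rather than a sum over $M$ coordinates, which is why the bound does not grow with $M$.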
