We study the problem of estimating the unknown mean $\theta$ of a unit-variance Gaussian distribution under local differential privacy (LDP). In the high-privacy regime ($\epsilon\le 1$), we identify an optimal privacy mechanism that minimizes the asymptotic variance of the estimator. Our main technical contribution is the maximization of the Fisher information of the sanitized data with respect to the local privacy mechanism $Q$. We find that the exact solution $Q_{\theta,\epsilon}$ of this maximization problem is the sign mechanism, which applies randomized response to the sign of $X_i-\theta$, where $X_1,\dots, X_n$ are the confidential i.i.d. original samples. Since this optimal local mechanism depends on the unknown mean $\theta$, we employ a two-stage LDP parameter estimation procedure that splits the agents into two groups. The first $n_1$ observations are used to obtain a consistent, but not necessarily efficient, estimate $\tilde{\theta}_{n_1}$ of $\theta$. This estimate is then updated by applying the sign mechanism, with $\tilde{\theta}_{n_1}$ in place of $\theta$, to the remaining $n-n_1$ observations, which yields an LDP and efficient estimator of the unknown mean.
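The two-stage procedure described in the abstract can be illustrated with a minimal Python sketch. Several choices here are assumptions and are not taken from the paper: the first stage uses the sign mechanism centered at a fixed pilot point (`pilot_center`, hypothetical) as one possible consistent LDP estimator, both stages recover the mean by inverting the randomized-response probability through the standard normal quantile function, and the function names are illustrative only.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the quantile function


def _keep_prob(epsilon):
    # Randomized-response probability of reporting the true sign.
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def sign_mechanism(x, center, epsilon, rng):
    """Release sign(x - center) through randomized response (+1 or -1)."""
    true_sign = 1 if x > center else -1
    return true_sign if rng.random() < _keep_prob(epsilon) else -true_sign


def estimate_from_signs(reports, center, epsilon):
    """Invert P(Z = +1) = (1 - p) + (2p - 1) * Phi(theta - center) for theta.

    Assumes epsilon > 0 and unit-variance Gaussian data.
    """
    p = _keep_prob(epsilon)
    frac_plus = sum(1 for z in reports if z == 1) / len(reports)
    phi = (frac_plus - (1.0 - p)) / (2.0 * p - 1.0)
    phi = min(max(phi, 1e-6), 1.0 - 1e-6)  # keep the quantile finite
    return center + _STD_NORMAL.inv_cdf(phi)


def two_stage_estimate(samples, n1, epsilon, rng, pilot_center=0.0):
    """Two-stage LDP estimate; each agent releases exactly one private bit."""
    # Stage 1 (assumed pilot): sign mechanism around a fixed point.
    stage1 = [sign_mechanism(x, pilot_center, epsilon, rng) for x in samples[:n1]]
    pilot = estimate_from_signs(stage1, pilot_center, epsilon)
    # Stage 2: sign mechanism centered at the pilot estimate.
    stage2 = [sign_mechanism(x, pilot, epsilon, rng) for x in samples[n1:]]
    return estimate_from_signs(stage2, pilot, epsilon)


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.gauss(0.5, 1.0) for _ in range(20000)]  # simulated, theta = 0.5
    print(two_stage_estimate(data, n1=2000, epsilon=1.0, rng=rng))
```

Each agent's report is a single randomized bit, so the privacy guarantee comes from randomized response alone. The split point `n1` and the clipping threshold are arbitrary choices made for this sketch.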